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Abstract 


Many  engineering  or  matheffladcal  prablems  require  to  factorize  smicnited  matrices  (Toe- 
plitz,  Hankel.  Vandermonde,  products  of  such  matrices  and  their  inveises.  Schur  complemens. 
etc)  either  in  explicit  or  in  disguised  foim.  Consequently  there  exist  various  analytic  tools 
legaiding  smictuied  matrices  as  well  as  several  fast  ftctorizadon  algorithms.  In  this  thesis,  we 
show  that  many  of  these  results  and  several  significant  generalizations  can  be  obtained  in  a 
very  consaucdve  way.  The  generic  form  is  to  use  elementary  circular  and  hyperbolic  transfor¬ 
mations  to  triangulatize  a  certain  array  of  numbers  derived  linm  the  displacemem  representa¬ 
tion  of  the  given  structured  matrix;  the  desired  results  can  then  be  read  off  from  the  resulting 
array.  These  "fast  array  algorithms*  require  0(mii)  operations  for  LU  and  QR  factorizations  of 
m  X  /I  structured  matrices,  and  O(mii)  or  even  O(alog^ii)  operations  for  solving  matrix  equa¬ 
tions.  Also  the  atray  form  suggests  various  alternative  algorithms,  depending  upon  the  order  in 
which  the  transfoimations  ate  applied:  these  variations  can  have  diffetem  numerical  propeities 
and  lead  to  different  impiemenatkni. 

Our  algorithm  is  based  on  a  generalized  definition  of  displacxmcm  for  Uodt-Toeplitz 
(Hankel)  and  Toepiitz  (Hankel)-block  matrices  slif^y  extending  the  previous  definitions  of 
Kailath.  Kung  and  Morf  (1979)  and  Lev-Aii  and  Kailath  (1984).  An  important  propeny  of 
displacement  stiticniic  is  that  it  is  preserved  under  Schur  complementations.  It  will  nun  out 
that  Toepliiz-(Haiikd)-derived  (near-ToepUiz.  ToepUa-Uke.  etc)  manices  are  perhaps  best 
regarded  as  particular  Schur  ccmpiemcra  obtained  hom  suiiahiy  defined  biodt  mairioeaL  The 
displacemem  stntcmte  is  used  *o  obtain  a  Schur  algorithm  (br  the  fast  triangular 

and  orthogonal  factorizations  of  all  such  matrices,  and  weU  sanciuted  fast  solutions  of  the 


corresponding  exes  and  cwettirsrttnined  sysiems  of  linear  equntnrs. 
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Abstract 

Many  engineering  or  mathematical  problems  require  to  factorize  structured  matrices  (Toe- 
plitz,  Hankel.  Vandennonde,  products  of  such  matrices  and  their  inverses.  Schur  complements, 
etc)  either  in  explicit  or  in  disguised  form.  Consequently  there  exist  various  analytic  tools 
regarding  structured  matrices  as  well  as  several  fast  factorization  algorithms.  In  this  thesis,  we 
show  that  many  of  these  results  and  several  significant  generalizations  can  be  obtained  in  a 
very  constructive  way.  The  generic  form  is  to  use  elementary  circular  and  hyperbolic  transfor¬ 
mations  to  triangularize  a  certain  array  of  numbers  derived  from  the  displacement  representa¬ 
tion  of  the  given  structured  matrix;  the  desired  results  can  then  be  read  off  from  the  resulting 
array.  These  "fast  array  algorithms"  requite  0{mn)  operations  for  LU  and  QR  factorizations  of 
m  X  n  structured  matrices,  and  0(mn)  or  even  0{n\o^n)  operations  for  solving  matrix  equa¬ 
tions.  Also  the  array  form  suggests  various  alternative  algorithms,  (Spending  upon  the  order  in 
which  the  transformations  are  applied;  these  variations  can  have  different  numerical  properties 
and  lead  to  different  implementations. 

Our  algorithm  is  based  on  a  generalized  definition  of  displacement  for  block-Toeplitz 
(Hankel)  and  Toeplitz  (Hankel)-block  matrices  slightly  extending  the  previous  definitions  of 
Kailath,  Kung  and  Morf  (1979)  and  Lev-Ari  and  Kailath  (1984).  An  important  property  of 
displacement  structure  is  that  it  is  preserved  under  Schur  complementations.  It  will  turn  out 
that  Toeplitz-(Hankd)-derived  (near-ToepUtz,  Toeplitz-Iike,  etc)  matrices  ate  perh^  best 


iv 


regaided  as  particular  Schur  complements  obtained  from  suitably  defined  block  matrices.  The 
displacement  structure  is  used  to  obtain  a  generalized  Schur  algorithm  for  the  fast  triangular 
and  orthogonal  factorizations  of  all  such  matrices,  and  well  structured  fast  solutions  of  the 
corresponding  exact  and  ovetdetermined  systems  of  linear  equations. 
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Chapter  1. 

Introduction. 

Fast  algorithms  for  factorizing  structured  matrices  that  include  Toeplitz,  Hankel  and  Van¬ 
dermonde  matrices  have  a  long  history.  The  earliest  known  fast  algorithm  is  probably  the 
Euclidean  algorithm,  which  has  recently  been  recognized  as  providing  a  fast  factorization  of 
Hankel  matrices  [14].  Factorizations  of  Hankel  matrices  also  underlie  the  criteria  and  the  fast 
algorithms  (due  to  Hermite  (18S6),  Hurwitz  (1895)  and  Routh  (1875))  for  checking  the  root 
distributions  of  a  polynomial  with  respect  to  imaginary  axis  (See  [45]  for  recent  generaliza¬ 
tions).  More  recently,  in  the  context  of  decoding  BCH  codes  Berlekamp  and  Massey  [5],  [48] 
gave  a  fast  algorithm  that  factorizes  the  inverse  of  a  Hankel  matrix  (See  also  [9],  [14],  [41], 
[54]). 

Fast  algorithms  for  Toeplitz  matrices  have  an  even  richer  history  [33-34].  Caratheodory 
(1911)  and  Toeplitz  (1911)  showed  that  the  positive-realness  of  certain  functions  is  equivalent 
to  the  positive-definiteness  of  certain  Toeplitz  matrices  [1].  Later,  Schur  (1917)  gave  a  fast 
algorithm  that  checks  the  positive-realness,  and  in  fact,  also  factorizes  close-to-Toeplitz 
matrices  [33-34],  [44],  [59].  The  Schur  algorithm  has  also  zppeaicA  in  dinerent  contexts  not¬ 
ably  in  seismic  deconvolution  problems  as  the  so-called  "layer-peelirig"  method  [11],  [35], 
[57],  in  orthogonal  filter  synthesis  [17],  [56]  and  in  checking  the  root  location  of  a  polynomial 
with  respect  to  the  um’t  circle  [33-34].  Baieiss  [4]  also  rediscovered  the  Schur  algorithm  as  a 
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fast  method  of  solving  Toeplitz  systems  of  equations. 

There  is  another  class  of  fast  algorithms  that  factorize  the  inverse  of  Toeplitz  matrices. 
They  include  the  recursions  of  the  Szego  orthogonal  polynomials  [62]  and  the  Levinson  algo¬ 
rithm  [46].  As  closely  related  results,  there  are  the  Gohberg-Semencul  formulas  [21-23]  and 
the  Trench  recursion  [63]. 

In  1972,  Kailath  [29]  [31]  developed  fast  algorithms  for  Kalman  filters  associated  with 
continuous-time  constant  parameter  state-space  models.  These  algorithms  replaced  the  non¬ 
linear  Riccati  differential  equations  of  the  Kalman  filter  with  another  set  of  nonlinear  equations 
that  he  dubbed  the  Chandrasekhar  equations  because  equations  of  somewhat  the  same  form  had 
been  developed  by  Chandrasekhar  and  Ambarzumian  for  solving  certain  Wiener-Hopf  integral 
equations  encountered  in  radiative  transfer  theory  [12],  [60].  The  discrete-time  versions  of 
these  results  were  developed  by  Kailath,  Morf  and  Sidhu  (see  [37],  [38]).  Various  extensions 
were  made  jointly  by  them  along  with  Ljung  and  Friedlander,  and  nice  interpretations  were 
found  in  terms  of  scattering  theory.  In  the  course  of  this  work,  it  became  clear  that  there  were 
close  relations  between  these  state-space  results  and  the  Levinson  and  Schur  algorithms  for 
solving  Toeplitz  equations  and  factoring  Toeplitz  matrices  (see  the  review  paper  [30],  [32], 
which  contains  many  references).  As  noted  therein,  Kailath  et.  al.  found  that  the  key  concept 
enabling  the  different  fast  algorithms  was  what  they  called  DISRACEMENT  STRUCTURE. 
This  is  in  many  ways  a  natural  generalization  of  Toeplitz  structure;  for  example,  the  inverse  of 
a  Toeplitz  matrix  is  not  in  general  Toeplitz,  but  all  matrices  aiKl  their  inverses  have  the  same 
displacement  rank.  Structured  matrices  (Toe[4itz,  Hankel,  Vandermonde,  products  of  such 
matrices  and  their  inverses,  Schur  complements  with  respect  to  various  entries,  etc)  aU  have 
low  displacement  rank.  The  complexity  of  numerical  computations  with  structured  matrices 
depends  upon  their  displacement  tank.  The  concept  of  disfdacemett  rardt  has  been  developed, 
extended  and  applied  in  many  ways  by  Kailath  and  his  students  and  colleagues  (Morf,  Sidhu, 


Dickinson,  Ljung,  Kung,  Friedlander,  Vcighese,  Vieira,  Levy,  Lee,  Lev-Aii,  Delosme,  Porat, 
Cioffi,  Bnickstein,  Citron,  Bistritz,  Rao,  Dewilde,  Dym,  DePrettere,  Pal)  -  see  the  review  paper 
[34]. 

As  the  reader  may  have  anticipated  from  the  above  discussion,  the  various  results  men¬ 
tioned  therein  have  been  developed  and  presented  using  a  variety  of  algebraic  and  analytic 
tools.  The  main  contribution  of  this  thesis  is  to  show  that  the  above  results,  and  several 
significant  generalizations,  can  be  obtained  in  a  very  constructive  (or  algorithmic)  way.  (At 
least,  we  have  so  far  shown  this  for  many  of  the  earlier  results;  with  more  work,  one  might 
anticipate  being  able  to  replace  "many”  by  "all".  -  see  the  remarks  in  the  last  chapter). 

The  generic  form  is  to  use  elementary  circular  and  hyperbolic  transformations  to  triangu- 
larize  a  certain  array  of  numbers  derived  from  the  displacement  representation  of  the  given 
structured  matrix;  the  desired  results  can  then  be  read  off  from  the  resulting  array.  This  new 
array  form  suggests  various  alternative  algorithms,  depending  upon  the  order  in  which  the 
transformations  are  applied;  these  variations  can  have  different  numerical  properties  and  lead  to 
different  implementations. 

The  basic  ideas  can  be  seen  from  the  simple  examples  in  the  next  sectitm.  However,  it 
may  be  noted  here  that  such  array  form  algorithms  were  introduced  into  least-squares  problems 
independently  by  Golub  [24]  and  by  Dyer  and  McReynolds  [18],  and  further  developed  by 
Bierman  [6],  among  others.  Generalizations  for  Riccati  and  Chandrasekhar  recursions  were 
introduced  under  the  name  square-root  algorithms  by  Morf  and  Kailath  [SO]. 

1.  Prototype  Examples. 

In  this  Chapter,  we  shall  ex[4ain  the  basic  idea  of  the  fast  algorithms  for  structured 
matrices.  To  provide  some  motivation,  we  shall  also  briefly  introduce  several  ptoblons  that 


involve  structured  matrices. 
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Simultaneous  Factorization  of  T  and  T~^. 
Let  r  be  a  Toeplitz  matrix 


T  = 


Co  Cl  •  C^_i 

Ct  Co  •  C,_2 


Co  —  L 

I 

C«-1  C,_2  •  Cq 

Then  it  is  easy  to  check  that  T  can  be  expressed  in  the  so-called  displacement  form  [33-36], 

T  =  L(x,)L^(xi)  -  L(x2)L^(xf^.  (1) 

where  L  (x)  denotes  the  lower-triangular  Toeplitz  matrix  with  the  first  column  x.  and 

Xi  =  [1.  Cl.  C2.  •  •  .  C,_i]^.  X2  =  [0,  Cl.  C2.  •  •  ,  c„.,f. 

Now  we  form  a  pre-array 


A  = 


L(xi)  L(x2) 
/  / 


and  post-multiply  A  with  any  J-orthogonal  matrix  0,  viz.,  one  that  satisfies 


0/0^  =y,  J  = 


In  O 
O 


that  will  yield  a  triangular  post-array 


Li  O 
U  Lj 
Then  it  turns  out  that 


A0  = 


s  A,  say. 


r  =LiL[. 


r-'  =  UU'^  =  LjLI- 
The  proof  is  veiy  simple.  We  just  compare  entries  in  the  identity 


A/A^  =  Aa/0^A^  =  2^A^. 

From  (2)  and  (3),  wc  have 


A/A^  = 


L (x,)Z.^(x,)  -  L  (x2)L^(x2)  L  (xi)  -  L (xj) 
L^(x,)  -  L^(x2)  O 


UL\  -L-jLI 


(2) 


(3) 


(4) 


(5a) 


(5b) 
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Now  equating  corresponding  entries  gives 

LyL[  =L(Xi)L^iXi)  -  L(X2)L^(X2) 

UL]  =  L^(x,)  -  L^ix^  =  / 

UU'^  -  LiLl  =  0. 

Therefore, 

T=LxL{. 

7"*  =  LfLi^  =  =  L2LI. 

Remark.  Determining  the  AR  parameters  of  a  raiKlom  process  requires  solving  a  special  Toe- 
plitz  systems  of  equations  called  the  Yule-Walker  equation  (or  Noimal  equatitm).  The  Yule- 
Walker  equation  has  a  special  right-side  vector,  viz.,  the  last  column  of  the  Toeplitz  matrix 
shifted  by  one  position.  One  can  easily  prove  that  the  solution  of  Yule-Walker  equation  is  the 
normalized  last  column  of  the  upper  triangular  matrix  U ,  where  T"’  =  UU^ .  Similarly, 
decoding  BCH  codes  requires  solving  the  Hankel  system  of  equations  whose  right-side  vector 
is  the  last  column  of  the  Hankel  matrix  shifted  by  one  position.  The  factorization  of  the  inverse 
of  the  Hankel  matrix  also  gives  the  solution  for  such  equations. 


We  still  need  to  show  how  to  find  such  a  matrix  6.  This  can  be  done  in  many  ways. 
One  is  as  a  sequence  of  hyperbolic  rotations. 


1 


Hijik) 


1 

(1  -  kY 


1  -k 

-k  I 


Ikl  <  1. 


where  k  is  called  the  reflection  coefficiera  or  Schur  parameter.  Let  us  consider  a  row  vector 
xT  g  ijiwi,  and  Hij{k),  where  k  -XjlXj,  the  ratio  of  the  jth  and  ith  elements.  With  this 
choice,  we  will  have 


[■'Xi-Xj-  ]Hij(,k)  a  t  •  •  Xj'  •  •  0  •  •  ]  ■  x'^’,  Xi'm  Vxj^-x/, 
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where  only  the  ith  and  jth  elements  of  x  are  altered.  Qeariy,  we  can  introduce  a  zero  at  any 
position  in  the  pre-array  by  post-multiplying  the  pre-array  with  an  appropriate  hyperbolic  rota- 
tioa 

We  shall  illustrate  the  annihilation  procedure  with  a  3  x  3  Toeplitz  matrix. 

1  Cl  C2 

T=  Cl  1  Cl 
C2  Cl  1 

We  first  form  a  pre-array  A,  and  post-multiply  A  by  // 2,4(0 1)  to  annihilate  an  element  c  1  in  the 
(1,2)  block,  resulting  in  Ai: 

A/f  2,4(0 1) 

Now,  we  annihilate  the  remaining  ci  in  the  (1,  2)  block  with  and  the  last  element  dj 

in  the  (1,  2)  block  with  Hj  ^id-j/di)  resulting  in  the  post-array  A: 
r 

1 

Cl  d, 

C2  d2  di  d-i  j 

1)  =  *  A2  A2/f 

di  d^  d^  dj 

di  <#4  1 

Note  that  the  Toeplitz  structure  allows  the  whole  subdiagonal  in  the  (1,  2)  block  to  be  nulled- 
out  with  hyperbolic  rotations  having  the  same  reflection  coefficient  ki.  This  suggests  that  we 
can  keep  only  two  columns  and  operate  on  them  as  shown  below; 


The  entries  in  X,  X,  and  Xi  completely  determine  the  matrices  of  Lj,  L2  and  U .  This  con¬ 
struct  can  be  regarded  as  a  combined  form  of  Schur  algorithm  [33-34],  [59]  (see  also  [4],  [42- 
44],  [55])  and  Levinson  algorithm  [46].  Wc  remark  also  on  the  simplicity  of  the  algorithm 
description  and  justification. 

By  inspecting  (6),  the  overall  operation  count  for  finding  L\,  L2  V  for  an  n  x  /i 
Toeplitz  matrix  T  can  be  seen  to  be 

22)41  =4/j2  +  0(„). 
i=i 

This  count  can  be  reduced  by  half  if  we  use  fast  rotations  (see  [13]  for  details). 

QR  factorization  ofT  [13]. 

Similar  ideas  can  be  used  for  finding  the  fast  QR  factorization  of  a  Toeplitz  matrix  T. 
First,  it  is  not  hard  to  check  that  T^T  and  T  have  the  displacement  forms, 

T^T  =  L(wi)L^(wi)  +  Hvr^LTiyr^  -  L{w3)L^(yr3)  -  L(W4)L^(W4), 

T  =  L(Vi)L^(w,)  +  L(y^L^(yri)  -  LCvjjL^Cws)  -  (yrif, 

where  the  vectors  w,  and  v,-  are  defined  as 


W|  =  T  *  t2» 

V,  =  -V3  *  t|/l It] II,  V2  =  Cl,  V4  =  0, 


L(w,)  L(W2)  L(W3)  Z,(W4) 

^=[£-(v,)  L(V2)  L(V3)  L(V4)J' 

and  post-multiply  F  with  any  J-orthogonal  matrix,  where 

In 

In 

y  =  -/, 

-In 

that  will  annihilate  the  (1,  2).  (1,  3)  and  (1,  4)  blocks  of  F,  and  triangularize  the  (1,  1)  block  of 


-9- 


known  fast  QR  factorization  algorithms. 

2.  Some  Examples  of  Structured  Matrices. 

How  about  the  factorization  of  non-Toeplitz  matrices  and  their  inverses?  In  many  signal 
processing  problems,  one  needs  to  solve  structured  matrix  equations  or  to  factorize  structured 
matrices  either  implicitly  or  explicitly  {33-34],  [42-44].  Some  examples  [33-34]  involving 
Toeplitz  or  Toeplitz-like  mauices  are  the  Schur-Cohn  test  (for  checking  if  a  polynomial  has  a 
root  outside  the  unit  disk),  orthogonal  filter  synthesis,  finding  AR  filter  [39]  and  certain  inverse 
scattering  problems  [11],  [3S].  On  the  other  hand,  certain  decoding  algorithms  for  BCH  codes 
[14],  [41]  requite  the  factorizafion  of  Hankel  matrices,  and  finding  interpolating  polynomials 
needed  to  solve  Vandermonde  matrix  equations.  In  this  section  we  shall  present  some  well- 
known  examples  bringing  in  structured  matrices. 

Example  1  Two  Dimensional  Filtering  with  finite  Samples. 

Let  [yij]  and  [ri,j}  be  uncorrelated  Gaussian  random  images,  arxl  suppose  we  observe 

yiJ  ~  ^ij  Bij- 

Let  the  image  planes  be  stationary,  i.e.. 

^  \yijykt  ]  =  ^  I'll,/ ^*,7  ]  =  fk-ij-j  • 

It  is  desired  to  find  the  estimator  based  cm  the  measurements  in  the  square  region  centered  at 

(».;■) 

iij  =  E[Xij  ly^j  :  i-m  6  i+m,  j-m  S  S  y'+m] 

i+«  J+m 

=  £  L  Okjykj-  (7) 

km-m  Irnj-m 

The  coefficients  in  (7)  can  be  found  firom  the  orthogonality  principle  [32], 

leading  to  the  following  block-ToepIitz,  Toeplitz-block  (or  douUy  Toeiditz)  system  of 
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equations. 


■  To  r, 

Ti  To 

T2m  ■ 
T'im-X 

^i-m 

S-m 

8-flf +i 

Tim  T-im-l 

^0 

ft- 

where 


(8) 


a*  =  [ak^^.  ■  •  .  e 

Sk  =  [gk.-m  ■  ■  ,gk^f  ^  _j-_ 

The  algorithms  to  be  described  in  Chapter  2  can  be  used  to  solve  (8)  with  0{m^)  operations. 

Example  2  Numerical  Solution  of  Inte^al  Equations  [19],  [47],  [52],  [58]. 

In  some  signal  processing  ^plications,  we  need  to  solve  certain  integral  equations,  such 
as  the  Wiener-Hopf  equation. 

r^(r+X)  =  £_rjy(/-t)/i(T)<ft,  r>0.  (9a) 

or  the  equation  that  arises  in  inverse  filtering  [27]  for  image  restoration, 

g(x,y)==j  j  h(x-a,  y-^)f(<x,  ^)dad^.  (9b) 

ja  VC 

The  equation  (9b)  is  usually  solved  numerically  after  discretization  using  some  quadrature  for¬ 
mula,  which  will  yield  a  matrix  equation.  It  is  known  that  solving  an  integral  equation  of  the 
form  (9)  is  inherently  an  ill-posed  (ill-conditioned)  problem.  For  simplicity  let  us  consider  the 
single  variable  case, 

g(x)  =  fKix.y)f(y)dy.  (10) 

Phillips  [52]  gives  a  good  discussion  of  the  difficulty  involved.  Following  Phillips,  let 

fm(y)  ■  sin(my).  Then  for  any  integrable  kernel  K(x,  y),  it  is  known  that 

Therefore,  an  infinitesimal  perturbation  in  g  can  cause  a  finite  perturbation  /„  in  /.  Also, 
one  would  expect  that  g^,  -*  0  (ya  m  -*  <>»)  faster  for  flat  snooth  kernels  than  for  sharply 
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peaked  kernels.  Let 

(11) 

be  the  matrix  equation  obtained  from  (10)  by  some  discretization  procedure.  If  we  refine  the 
discretization,  then  the  ill-conditioned  kernel  would  appear  as  an  ill-conditioned  matrix.  On  the 
other  hand,  if  we  use  a  large  mesh  widdi,  the  transformation  fnnn  the  integral  equation  to  the 
matrix  equation  can  be  ill-conditioned,  aixl  therefore  we  may  be  solving  a  (possibly  well- 
conditioned)  different  problem. 

One  way  to  try  to  overcome  this  difficulty  is  via  regularization.  The  ill-conditioning 
manifests  itself  as  an  oscillatory  solution  f{x).  Therefore,  it  is  reasonable  to  constrain  the 
solution  to  have  some  smoothness,  e.g.,  to  require  that 

£/<")(x)dx  ^  r, 

or  after  discretization 

(12) 

For  example  when  n  =  1  and  2, 

/('\x)A*  =/(x  -t- Ax)-/(x),  /(2>Ax  =/(x  +  Ai)-2/(x)-t-/(x  -  Ax). 

Therefore, 


1  -1 

1  -2  1 

1  -1 

1  -2  1 

L(*)  = 

\\ 

\\\ 

1  -I 

1  -2  1 

Now,  we  solve  the  following  constrained  minimization  i»oblem  rather  than  (1 1), 

mml  lATf  -  gl  I2  subject  to  t  I  I2  =  y.  (13) 

To  solve  (13),  we  form  the  Lagrange  function 


t  The  constraint  in  (12)  is  usually  active. 
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F(f.Ti)s  llA'f-glP  +  Ti(ll/-fll^-Y). 

and  from  =  0,  we  get 

of 

+TlL^L)f  =  A:^g. 

This  can  be  recognized  as  seeking  the  least  squares  solution  to  the  linear  system 


K 

i\L 


f  = 


(14a) 


(14b) 


The  Lagrange  multiplier  q  is  usually  chosen  by  trial  and  error.  If  L  is  the  identity  matrix,  then 
this  method  reduces  to  the  so-called  "damped  least  squares  method”  (also  see  [25,  pp.  145, 
P6.1-9]  for  computing  approximate  pseudo  inverses  with  this  technique). 


For  the  convolutional  kernel,  the  matrix  K  is  Toeplitz,  and  the  matrix 


K 

r\L 


is  close-to- 


Toeplitz.  The  algorithms  to  be  discussed  in  Qiapter  2  can  be  used  to  solve  (14)  in  O(n^) 
operations. 


Example  3  Maximum  Likelihood  Estimation  of  ARMA  Parameters  [3],  [40,  pp.  125-127]. 

Let  [y(r)}  and  (e(r))  satisfy  the  difference  equation 

y(0  +  aLy(t-l)  +  •  •  +  Opyit-p)  =  e{t)  +  c,e(r-l)  -!-•+  +  L,e{t-q)  (15) 

where  e(r)  is  a  zero-mean  white  Gaussian  process  with  variance  o^. 

A(z)  a  2^ -»■  fliz'’”*  +  •  •  +  Op  ^  0  for  lrl<l, 

C(r)  a  z’ +  Ciz^“* -t- •  •  +  ^  0  for  lzl<l, 

and  {a,  },  (c/}  are  unknown  deterministic  constants.  Given  measurements 

Jn  y(0),  y(-l),  •  • ,  y(-p)f 

it  is  desired  to  find  an  estimator  for  and 

0  =  [01.  02.  •  • .  0,r  =  [oi.  •  • ,  ci.  •  • .  . 

Note  that 


P(Jn)  = 
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Therefore, 


N 


P(yN)  =  (nP(y(Oly,-i)lPO'(0).  -.yi-p)) 

/=i 

Because  {e(r))  is  Gaussian,  P(y(^)ly/_i)  is  also  Gaussian,  and 


(16) 


P(y(Oiy,-i)  = 


yiKp- 


rCXp 


Ly(r)-y(Or 


2p2 


(17) 


where 


3?(0  =  £(y(0ly,-il  (18) 

p2  =  £[(y(r)-y(r))(y(/)-y(r))]  =  E[e(t)e(t)]  = 

First,  note  that  the  probability  density  P(y(0),  ■  ■  ,  y(p))  is  a  complicated  function  of 

{y(0),  •  •  ,  y(-p)),  0  and  a,  and  therefore,  it  is  difficult  to  find  the  maximum  likelihood  esti¬ 
mate  of  0  using  P(yN)  in  (16).  Instead,  we  maximize  the  conditional  probability  density  func¬ 
tion 


P[y/vl(y(0).  •  •  .y(-p))l  =  np(y(r)ly,.,).  (19) 

1=1 

To  use  (19)  for  maximum  likelihood  estimation,  we  need  to  express  in  (17)  in  terms  of  0. 
If  we  assume  that  we  know  {c(r)},  then 

:;(0  =  £[y(f)ly,-,l 

=  aiy(r-l)  +  •  +  apy(t-p)  +  cie(r-l)  +  •  ■  +  c^eit-q). 

However,  we  do  not  know  {c(f)),  so  we  qiproximate  e(f)  with  c(t)  that  is  computed  recur¬ 
sively  by 


e(0  =  y(t)  +  aiy(t-l)  +  •  •  +  a^yit-p)  -  c,e(r-l) - (20) 

With  this  ^roximation  note  that 

y(.t)-S{t)  =  tit). 

From  (19),  the  (conditional)  likelihood  function  L  is  given  by 

-logL  =  -3  +  AHogo  +  4log2ji. 

2<r  kmi  2 

Note  that  maximization  of  L  with  respect  to  the  parameteis  o  and  0  can  be  done  separately. 
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MaximizaUon  of  L  with  respect  to  0  is  equivalent  to  minimization  of 


With  the  following  notations. 


0/(0)  =  [  • 
du| 


notice  that 


]  6 


s 

DV(Q)  =  £e(/)  De(r)  =  B^b. 

/«! 

,  A'  ^  yv 

O  V(0)  =  Y.D£(t)D^e(t)  +  j;,e(t)Dh(t)  = 

'=1  i»i 

where 


[Oe(l)f 

e(l) 

B  = 

[Oe(2)]^ 

,  b  = 

e(2) 

[Oe(Ar)I^ 

• 

t(N) 

Using  (20),  one  can  easily  verify  that 


Mt) 

dOi 

Ml 

dci 


=  y(t-i)  -  -  cMf-2)  _  .  . 

aOj  doi 

dci  da 


de{t~Q) 
’  da, 

M-?) 

’  dc, 


If  we  define  the  matrices  Y  and  E, 


y(.0)  y(-I)  •  y(i_p)  ■ 

e(0) 

e(-I) 

Y  m 

^(l)  >-(0)  •  y(2-p) 

E  m- 

e(l) 

m 

y(m-2)  •  y(jn-p) 

1 

e(m-l)  e(m-2) 

then  the  matrix  B  in  (21)  has  the  form 


•  e(l^) 

•  e(2-^) 

•  e(m-q) 
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1 

Cl  1 


B  =C-\Y  £]. 


1 


Now  the  parameter  0  that  minimizes  V(0)  can  be  obtained  iteratively  by  using  Newton’s 


method. 


(Conditional)  Maximum  Likelihood  Estimate 
Set  initial  estimate  0^0; 
repeat 

Compute  e(r)  with  (20): 

Solve  5s  =  b; 

0  ;=  0  +  s 
until  convergence. 


The  matrices  C~^Y  or  C~'£  are  not  Toeplitz,  but  they  are  closc-to-Toeplitz. 


Example  4  Instrumental  Variable  Method  [20]. 

Consider  an  ARMA  model  as  in  (15),  and  let  0^  =  [oi,  •  •  ,  a^]^.  The  parameter  %  can 
be  obtained  by  solving 

rjri0  =  rjy,  (22) 

where 


y(<?) 

y(4+i)  • 

y(q+/V) 

y(0) 

y(-l)  • 

y(l-p) 

y(<7-i) 

y(?)  • 

y(<7+iv+i) 

y(i) 

y(0)  ■ 

yil-p) 

* 

' 

.  r,= 

• 

y{q-p+\) 

• 

•  . 

.y(^-i) 

• 

y(N^) 

y  =  [y(i).y(2),-.y(N) 

Again  the  matrix  T^Tx  is  close-to-ToepIitz,  and  the  fast  algorithms  in  this  thesis  can  be  used  to 


solve  (22). 
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Example  5  The  Euclidean  algorithm  and  Hankel  Matrix  Factorization,  t 

The  Euclidean  algorithm  that  tests  if  two  polynomials  are  relatively  prime  or  not  in  fact 
factorizes  Hankel  matrices.  To  see  this,  let  us  consider  the  3  x  3  Hankel  matrix 


5  3  2 
//  =  3  2  1 
.2  1  4 

We  define  the  polynomial 


p(x)  =  5x*  +  3jt^  +  2x^  +  X  +  4. 

whose  coefficients  are  the  1st  column  and  the  1st  row  of  H.  Also  let 


q(.x)  =  x\ 

We  repeatedly  divide  as  follows. 


5xV3x^+2x^+x44  Ix^ 


5  5  5  5 

-—x*-—x^-—x^-~x 
5  5  5  5 


|5x'+  3i’+  +  1+4 


t  The  division  alforithm  shown  here  is  slightly  different  from  the  classical  Euclidean  algorithm.  However, 
clanical  Euclidean  algorithm  can  be  also  used  for  this  purpose. 
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9 

5" 


3  3  3 


3423  1  2  4 

— —X  — X - X - X 

5  5  5  5 

_1.4.  3  3_51.  2  36 

-x^  +  IQ*^-  &r 


-x^+lOx^-ix 


-—x^+  ^x^-^x+4 
3  3  3 

1  3,  10  2  8 

—rx  +-:rx  ~ 

3  3  3 


-^x^-  3:t+4 


(22b) 


-3x^-3x+4 


3  2  3 

-X-  X‘‘+-rX 
_ 4 

4 

Now  consider  the  truncated  divisor  polynomials  in  (22), 


(22c) 


p^x)  ^  5xU3xh2x^.  Pi(x)*^x^-hjx^  P2(x)m-3x^ 
and  the  highest  degree  terais  of  the  dividend  polynomials  in  (22), 


q<j(x)mx^.  qjjixym-x^. 

If  we  multiply  pj(x)  by  the  coefficients  of  9,-(x),  then  it  turns  out  the  resulting  polynomials 
form  the  columns  of  Oiolesky  factors  of  H.  Namely. 


■5 

■  1/5 

’53  2  ■ 

H  = 

3  1/5 

5 

1/5  -1/5 

2  -1/5  3. 

1/3. 

3 

The  computational  effort  is  Oin\  Further  discussion  of  the  proUem  will  be  given  in  Oiapter 
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Example  6  Polynomial  Interpolation  [IS.  pp.  38-46]. 

Given  n  distinct  points  Xj,  x^,  ,  x,,  we  can  find  a  polynomial 

p  (x)  =  a„_ix"~'  +  d,_2x"“^  +  •  •  +  aix  +  ao  (23) 

of  degree  at  most  n-1  by  solving 


1 

1 


X,  •  X?  ‘ 

V  .  -r"-! 

X2  X2 


^0 


>2 


1 


On-l 


ym 


yi  ^p(xi). 


Instead  of  finding  the  coefficients  of  the  polynomial  in  (23),  we  could  as  well  find  the 
coefficients  of  the  so-called  Newton  form, 


p(x)  =  Co  +  c,(x-xo)  +  C2(x-xo)(x-x,)  +  •  •  +  c,(x-Xo)  •  •  (x-x,_,). 

One  can  check  that  the  coefficient  c^  depends  only  on  the  values  of  fix)  at  the  points 

xq.  xj,  •  •  ,  X*;  it  is  called  the  kth  divided  difference  of  fix)  at  the  points  Xq,  Xj,  •  •  ,  x*  and  is 

denoted  by  /  [xq,  •  • .  x*  ].  Also  one  can  check  that 


flXQ,  -  ,x*]  = 


f[xi,  •  •  .Xtl  -/[Xq,  •  •  .Xt,i] 

X*  -Xo 


(24) 


After  finding  the  Newton  fonn,  we  can  recursively  compute  the  coefficients  a,-  in  (23),  if  we 
wish,  by  using  the  identity 


Co  +  Ci(X-X(j)  +  C2(X-Xo)(x-Xi)  +  CjiX-X(^X-Xi)iX-X:) 

=  Co  +  (c  1  +  (C2  +  c^x-x^Xx-x  i))(x-xo).  (25) 

The  computations  in  (24)  and  (25)  need  Oin^)  operations,  and  in  faa  factorize  the  Vander¬ 
monde  system  of  equations  [8].  The  factorization  algorithms  (for  Vandermonde  matrices)  in 
Chapter  4  are  closely  related  to  this  method. 


3.  Outline  of  the  Thesis. 

The  idea  in  Sec  1  is  extended  in  Chapter  2  to  find  triangular  factorizations  and  QR  fac¬ 
torizations  of  bIock-Toq>Iitz  and  Toqriitz-block  matrices.  In  Chapter  3,  we  slightly  change  the 
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pre-array  in  Sec  1  to  constiuctively  obtain  ceitain  well-known  Toeplitz  inverse  expressions 
called  the  Gohberg-Semencul  formulas.  We  also  generalize  the  Gohbeig-Semencul  formulas 
[22-23]  to  a  large  class  of  matrices.  Some  related  results  are  in  [21],  [26],  [39],  [42-43],  [63]. 
In  Chapter  4,  we  show  how  to  factorize  close-to-Hankel  matritxs  such  as  Hankel,  block- 
Hankel,  Hankel-block  and  Vandermonde  matrices.  Some  previous  results  are  [S],  [9],  [41], 

[48] ,  [53-54],  [61].  In  Chapter  5,  we  present  a  divide-and-conquer  ai^roaches  for  finding 
solutions  of  block-Toeplitz  and  Toeplitz-block  matrices:  for  related  results,  see  [2],  [10],  [16], 

[49] ,  [51].  Some  concluding  remarks  are  offered  in  Chapter  6.  Each  chapter  is  self-contained, 
and  therefore,  readers  can  essentially  read  them  in  any  order. 
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Chapter  2. 

Generalized  Displacement  Structure  for  Block-Toeplitz, 
Toeplitz-block,  and  Toeplitz-derived  Matrices 


Abstract 

The  concept  of  displacement  stnicture  has  been  used  to  solve  several  problems  connected 
with  Toeplitz  matrices  and  with  mattices  obtained  in  some  way  from  Toeplitz  matrices  (e.g.  by 
combinations  of  multiplication,  inversion  and  faaorization).  Matrices  of  the  latter  type  will  be 
caUed  Toeplitz-derived  (or  Toeplitz-like).  In  this  chapter  we  shall  introduce  a  generalized 
definition  of  displacement  for  block-Toeplitz  and  Toeplitz-block  matrices.  It  wiU  turn  out  that 
Toeplitz-derived  matrices  are  perhaps  best  regarded  as  particular  Schur  complements  obtained 
fiom  suitably  defined  block  matrices.  The  displacement  structure  will  be  used  to  obtain  a  gen¬ 
eralized  Schur  algorithm  for  the  fast  triangular  and  orthogonal  factorizations  of  all  such 
matrices,  and  well  structured  fast  solutions  of  the  corre^nding  exact  arxl  overdetermined  sys¬ 
tems  of  linear  equations. 
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1.  Introduction. 

In  multichannel  signal  processing,  system  identification  and  image  processing  applica¬ 
tions,  one  encounters  various  fonns  of  structured  matrices.  One  interesting  family  consists  of 
matrices  having  a  block-Toeplitz  form 


Bq  fl_| 
B\  Bo 


Bm-2 

or  a  Toeplitz-block  form 


^-A^+l 

fi-jv+2 


Bii  rectangular  matrices. 


(la) 


A-}  = 


7*1.2  • 

Txj, 

T'z.j 

7” 2.2  ■ 

Jm.i 

7##/r 

Tiji  rectangular  Toeplitz  matrices 


(lb) 


or  often  as  Schur  complements  with  respect  to  various  entries  in  A  j  or  A  2  (see  the  examples  in 
Sec  4).  Often  we  call  the  matrix  >4 1  m  M  xff  block-Toeplitz  array,  and  the  matrix  A2  an 
M  X  N  Toeplitz-block  array;  the  matrices  obtained  as  Schur  complements  are  often  not  Toe¬ 
plitz  at  all,  but  have  been  called  near-Toeplitz,  or  close-to-Toeplitz,  or  Toeplitz-like  or 
Toeplitz-derivcd  mauices. 

For  such  A ,  we  shall  show  how  to  obtain  fast  triangular  factorization  A  =LU,  and  fast 
QR  factorization.  A  =  QR,  which  among  other  things  will  also  give  us  fast  nicely  structured 
methods  for  solving  exactly-determined  systems  of  equations. 


v4x  =  b,  i4eR"’*,  A  is  strongly  nonsingular  (2) 

and  also  over-determined  systems  of  equations. 


i4  x  =  b.  Am  m  kn,  has  ftUl  column  rank.  (3) 

Our  results  will  be  based  on  a  generalization  of  the  concept  of  displacemettt  structure  used  in 

earlier  work  (see  e.g.,  [13-17]).  Besides  enabling  us  to  solve  several  new  problems,  this  gen¬ 
eralized  concept  will  also  provide  a  new  and  simpler  approach  to  many  of  the  problems  studied 
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in  [13-17]  and  [4].  However,  first  we  briefly  review  earlier  approaches  and  results. 

For  a  square  block-Toeplitz  matrix  A  |  e  ,  with  square  blocks  B;  e  ,  there  exist 
several  fast  triangular  factorization  algorithms  such  as  the  Bareiss  algorithm  [1],  the  multichan¬ 
nel  Levinson  algorithm  [18]  and  the  Schur  algorithm  [13-14],  [20],  all  of  which  require  matrix 
(of  the  block  size,  r  x  r)  operations.  Our  approach  will  treat  block-Toeplitz  matrices  in  essen¬ 
tially  the  same  way  as  scalar  Toeplitz  matrices,  and  in  particular  will  use  only  elementary 
scalar  operations;  the  absence  of  matrix  operations  such  as  inversion  will  simplify  the  design  of 
dedicated  hardware  implementations.  For  a  square  Tocplitz-block  matrix  A2  e  R"’“  with 
Tij  e  the  previous  approaches  were  first  to  transform  A  2  into  a  block-Toeplitz  matrix 

by  pre-  and  post-multiplication  with  permutation  matrices,  and  then  apply  an  algorithm  for 
square  block-Toeplitz  matrix  to  get  a  row-  and  column-  permuted  triangular  factorization  of 
A 2;  there  is  clearly  a  difficulty  with  this  approach  when  Also  the  permuted  matrix 

might  not  be  strongly  nonsingular.  Our  approach  will  not  have  this  problem  because  it  directly 
factorizes  A  2  without  permutations.  Finally,  for  matrices  obtained  via  Schur  complementation, 
the  concept  of  displacement  structure  (see  e.g.  [4],  [14-17])  has  been  used  to  obtain  a  number 
of  fast  algorithms;  in  particular,  several  algorithms  have  recently  appeared  [2],  [4],  [8],  [21]  for 
the  orthogonalization  of  scalar  Toeplitz  matrices;  our  new  approach  also  provides  a  generalized 
unification  of  these  algorithms. 

Several  illustrations  and  applications  of  our  approach  will  be  given  in  Sec  4.  Our  choice 
here  was  made  in  part  to  relate  to  examples  and  problems  that  are  studied,  generally  in 
diHerent-ways,  in  other  chapters  of  this  thesis. 

Our  generalized  definition  of  displacement  structure  is  presented  in  Sec  2.  A  correspond¬ 
ingly  generalized  Schur  algorithm  for  matrix  factorization  is  derived  in  Sec  3.  As  just  men¬ 
tioned,  Sec  4  contains  various  applications.  Finally  some  computational  aspects  are  elaborated 
in  Sec  S;  in  particular  we  may  ix>te  the  introduction  of  minors,  which  include  as  special  cases 
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the  circular  (Givens)  and  hyperbolic  rotations  as  well  as  the  well-known  elementary  (or  elimi¬ 
nation)  matrices. 

2.  Displacement  of  Matrices. 

Let  A  e  R'"^  be  a  given  matrix,  and  let  and  F*  be  strictly  lower  triangular 
matrices.  The  matrix 

sA  -  F^AF'^'^  (4) 

will  be  called  the  displacement  of  A  with  respect  to  the  ^placement  operators  [F^ ,  F*}. 

Assume  that 

rankV^pf  j,tyA  =  a. 

Any  matrix  pair  {X,  Y]  such  that 

X  3(Xi,  X2,  ...Xa],  K  3[yi,  yj,  .  .  ,y„] 

wiU  be  called  a  generator  of  A  (with  respect  to  {F-^.  F* )).  The  number  a  will  be  called  the 
length  of  the  generator  (with  respect  to  {F^ ,  F'’]).  A  generator  of  A  with  the  minimal  possi¬ 
ble  length  will  be  called  a  mininuU  generator.  The  length  of  the  minimal  generator  of  A  (i.e., 
rank(V(^/^*y4))  will  be  called  the  displacement  rank  of  A  (with  respect  to  {F^,  F* )),  and 

denoted  as 

If  {;if .  y  1  is  a  generator  of  A  e  R"”*  with  respect  to  [F^ ,  F* )),  then  for  any  nonsingu¬ 
lar  matrix  5  e  the  matrix  pair  {XS,  YS~^ )  is  also  a  generator  of  A  because 

=;iry^  =xss-^y^.  (5) 

Let  {X,  y)  be  a  generator  of  a  matrix  with  respect  to  strictly  lower  triangular  displace¬ 
ment  operators  (F^ ,  F* }.  We  say  that  a  generator  is  proper  (with  respect  to  the  column  j)  if, 
for  a  certain  /,  all  the  elements  in  the  ith  row  of  X  aixl  above,  except  for  the  element 
are  zero,  and  all  elements  in  the  ith  tow  of  y  atxl  above,  except  the  element  [yi,j,  ate  zero: 
for  example. 
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'  ■ 

■ 

0  0*00 

0  •  0  ♦  0  •  0 

X  = 

*  ,  * 

.  T  = 

*.***.* 

*  .  4i  «  *  .  * 

Often  we  shall  denote  a  proper  generator  as  {Xp,  Yp).  If  {X,  Y]  is  not  proper,  then  by  choos¬ 
ing  appropriate  5.  we  can  obtain  a  proper  generator  (XiS,  under  certain  conditions  on 

the  matrix  A  (see  Sec  S). 

Note  that  the  displacement  of  a  symmetric  matrix  i4  can  be  written  as  ^^p/j?tyA  =  XIX^ 
where  £  is  a  diagonal  matrix  with  I  or  -1  along  the  main  diagonal;  we  shall  say  that  i4  has  a 
symmetric  generator.  {^,  ^I}. 

As  an  example,  for  a  square  Toeplitz  matrix  T  =  (r,_y)  e  R"^ 


to  i-t  ' 

to  o’ 

."O  1 

=  X1X^.  X  = 

tl  t, 

/■-I 

2  = 

1  0 

0  -1 
•  . 

Choice  of  Displacement  Operators. 

Let  {Af ,  K )  be  a  generator  of  length  a  of  A  with  respect  to  and  F* .  If  the  matrix- 
vector  multiplicatitxis  F'^ v  and  F^v  takes  f(n)  operations,  then  our  algorithm  in  Sec  3  will 
need  0(anf(n))  operations.  Therefore,  our  objective  is  to  choose  the  "simplest"  or  sparse  (to 
make  f(n)  smaU)  strictly  lower  triangular  matrices  F^  and  F^  that  also  make  a  as  small  as 
possible.  For  a  scalar  n  x  n  Toeplitz  matrix,  a  natural  choice  of  di^lacement  operator  is  tlK 
simple  n  X  n  shift  matrix,  Z,.  with  I’s  along  the  first  sub-diagtmal,  and  O’s  elsewhere. 

For  an  M  xN  block-Toeplitz  array  with  r  xs  blodcs,  the  following  choice  of  displace¬ 
ment  operators  gives  the  smallest  a, 

F/=z^,. 
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where  2*.  is  a  block  shift  matrix,  i.c.,  a  /t  x  4  anay  with  r  x  r  identity  matrices  on  the  1st 
block  subdiagonal  and  zeros  elsewhere. 

For  block-Toeplitz  or  Toeplitz-block  matrices,  it  is  straight  forward  to  obtain  generators 
from  the  displacements  by  inspection  (see  Appoidix  4  for  closed  forms),  as  we  shall  illustrate 
in  several  examples. 


Example  2.1.  For  the  following  block  ToepUtz  matrix  A ,  called  the  "Hurwitz  matrix",  we  can 
choose  ~  and  F*‘  =2  to  get  a  rank-2  displacement  , 


^3  ^5  4I7 

0|  O3  O5  O7 

Oq  O2  04  O^  • 

Oq  O2  O4  Og  • 

A  = 

0  Ox  as  03  ■ 

0 

0  Oo  02  04  • 

Note  that  =  XY^,  where 


X  a 


1  0  0  •  ■ 
0  1  0  •  • 


ax  03  as 
00  02  O4 


□ 


For  an  M  xN  Toeplitz-block  array  with  m,-  x  nj  Toeplitz  matrices,  we  shall  use  the  diS' 


placement  operators. 


At  N 

Ff  =  ©2^  e  F*  =  ©2^  €  R"’^, 

M 

where  ©  denotes  the  concatenated  direct  sum,  i.e.,  A  ©R 


A  O 
O  B 


Example  2.2  For  die  following  Toeplitz-block  matrix  A,  which  arises  in  ARMA  system 
identification  problems  [9],  we  can  choose  and  F^  =  Z2©23  to  obtain  a  rank-3  dis¬ 

placement. 
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Po  P-i  Yo  Y-i  Y-2 

Po  P-i  Yo  Y-i  Y-2 

Pi  Po  Yi  Yo  Y-i 

Pi  Yi 

A  = 

p2  Pi  Y2  Yi  Yo 

P2  Y2 

P3  P2  Y3  Yz  Yl 

P3  Y3 

Example  23.  Consider  the  matrix  M , 


M  = 


A  O 
O  I 
I  O 


If  A  is  an  M  x  N  block-Toeplitt  array  with  r  x  s  blocks,  we  could  choose 


=  F**  ®2sr . 


If  A  is  a  Toepiitz-biock  array,  for  example,  A  - 


Ti 

T2 


,  where  T i  e  and  e 


then  we  could  choose 


Remark  2.1.  One  might  check  that  the  displacement  operators  for  a  block-Toeplitz  matrix  and 
a  Toeplitz-block  matrix  are  related  by 

Z£. 

I«1 

where  P  is  the  permutation  matrix  that  transforms  the  block-Toeplitz  matrix  into  the  Toeplitz- 
block  matrix  by  pie  and  post-multiplication,  vice  versa. 


3.  Generalized  Schur  Algorithm  and  Partial  Triangularization. 

A  fundamental  method  for  triangular  matrix  factorization  is  the  so-called  Schur  reduction 
process,  which  computes  Schur  complements  of  leading  submatiices  iteratively.  Lev-Ari  and 
Kailath  [16-17]  realized  that  the  classical  Schur  algorithm  amounts  to  Schur  reduction,  and 
gave  several  important  generalizations  including  one  for  Hankel  matrices,  hi  the  rest  of  this 
chapter,  we  shall  further  elaborate  the  ideas  in  [17]. 
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Fast  Schur  Reduction  using  Displacement  Structure. 

Our  fast  algorithms  will  be  based  on  the  following  theorem. 


Theorem  3.1  Let  be  a  generator  of  a  rectangular  matrix  e  R"**  with  respect 

to  [F^ ,  F* }.  Also  assume  that  is  proper  with  respect  to  a  particular  (pivoting) 

column,  which  we  shall  index  as  "pvt".  If  we  denote  the  columns  of  and  by 


x;‘>  =  [xP,  •  • .  xW.  .  x<‘>].  f;‘>  =  [yP.  yP], 

then  the  matrix  A  defined  by 

has  null  first  column  and  row,  and  has  a  generator  {X®,  K®),  with  respect  to  (F^ ,  F*)  of 
the  form 


x(^>  =  [xp).  •.  •.  F/x^J).  ••.*«}.  •.  •.  •  • .  y^'"]. 


Proof 


A®  -  F/A®F‘^  =  [A(‘)  -  x«>y^)^]  -  F/[A»>  - 

=  A(‘)-f/a<'>F*^  -xWy^>^  +  Fx«)y;j>V^^ 

=  x(')r<'>"-x^>y^>"-HF*;j)y;j)V^" 

_  a)ya)r 

The  first  column  and  row  of  A^^  are  null  because 


A  ®e,  =  ]ei  =  0,  e?A  =  ef[X'^V®’’  ]  =  0, 

where  we  have  used  the  facts  that  F^  and  F^  are  strictly  lower  triangular  and  (X^*\  is 

proper.  □ 

Before  presenting  the  generalized  Schur  algorithm,  we  assume  that  we  have  a  procedure 
called  MakeProper  that  can  convert  a  given  generator  {X,  X )  of  A  e  R"**^  into  a  proper  gen¬ 
erator  (X^,  Xp )  (whenever  A  has  a  proper  generator);  this  can  be  done  with  0((xm)  operations 
as  will  be  shown  in  Sec  S. 

By  applying  the  previous  theomn  using  such  a  proper  generator  we  can  obtain  a  (possi- 


-  33- 


bly  non-proper)  generator  of  A®.  By  repeating  this  process,  r  times,  we  shall  generate  the 
matrices 

A(r)  _  Air-n  _  (r-l)T 


A®)  =  A<‘>-xWy»>[. 

It  turns  out  that  this  (XtKess  gives  a  partial  triangular  factorization  of  this  follows  by 
noting  that 


‘pVlj  ^/FW^ 


Remark  3.1.  If  we  define  A^'^  = 


v(l)T., 

y^i 


.(OT 


+  A^'’^'>,  A('’>: 


o  o 

Q  5(r^l) 


B  D 
C  E 


then  it  is  easy  to  check  that  =  £  -  CB~^D . 

The  matrix  is  called  the  Schur  complement  of  £  in  A^*^  Notice  that  the  above  process 
also  gives  a  generator  of 


Remark  3.2.  The  above  r  step  partial  triangularization  breaks  down  if  and  ordy  if  there  is  a 
singular  leading  principal  submatrix  of  order  less  than  or  equal  to  r;  we  shaU  assume  that  this 
is  not  so. 

The  above  procedure  can  be  summarized  in  the  following  algorithm,  which  we  shall  call 
a  generalized  Schur  algorithm. 


Algorithm  (Generalized  Schur  Algorithm) 

/apitf;  A  generator  {X,  K)  of  A  «  R"***  with  respect  to  [F^ ,  £*). 
Output:  (i)  Partial  triangular  factors  L  e  R”*’^  and  1/  e  R'’*”  of  A . 
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(ii)  A  generator  {X,  K)  of  the  Schur  complement  of  the  r  x  r  leading  principal  subma- 
uix  of  A , 

Procedure  GeneralizedSchur 
begin 

for  k  :=  1  to  r  do  begin 
MakePioper, 

The  k\h  column  of  L  .•=  The  /kth  row  of  U  :=  y^; 

Replace  with Refdace y^^  with  F^y^w: 
end 

return  (L,  U,  (X.  K)); 

end. 

Note  that  the  above  procedure  needs  0(amr)  operations,  where  a  is  the  length  of  the 
given  generator,  assuming  that  MakeProper  takes  0(am)  operations  (see  Sec  S). 


Example  3.1  Triangularization  of  blodc-TocpIitz  or  Toeplitz-block  matrices. 

At  trivial  examples  we  can  triangularue  block-ToepUn  matrices  or  Toeplitz-block 
matrices  simply  by  completing  the  above  generalized  Schur  algorithm.  Note  that  the  multipli¬ 
cations  f  '  Xpy,  and  F^’jpy,  amount  to  shifting  down  "segments”  of  Xp^  and  jp^ . 


Example  3 JZ  Simultaneous  Factorization  of  a  Toeplitz  nuitrix  and  its  Inverse  [4]. 
Consider  the  matrix 


T  I 
I  O  * 


r  =  (r,_pe  to=l 


which  has  the  following  symmetric  generator. 


(6) 


X  = 


1 

0 


h 

h 


1  0  •  0  ' 

1  0  •  0  • 


r= 


1  0 
0  -1 


After  performing  n  steps  of  partial  triangular  factorization  using  the  generalized  Schur  algo¬ 


rithm,  we  shall  have 


A  K 


L 

U 


I  L^,  1  + 


O  O 

o  s 


Now,  one  can  check  by  comparing  the  entries  of  A  in  (6)  and  (7),  that 


(7) 
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T=UJ,  T-^  =  UU^. 

Remark  33  Recall  that  the  classical  Schur  algorithm  [20]  gives  only  the  factorization, 
T  =uJ ,  whereas  the  Levinson  algorithm  gives  the  factorization,  7^'  =  UU^ .  If  one  only 
needs  the  factorization  of  r~'  the  above  method  is  slower  than  the  Levinstxi  algorithm  [18];  a 
derivation  of  the  Levinson  algorithm  iiom  the  generalized  Schur  algorithm  can  be  found  in 
Appendix  3. 


Example  33  Orthogonalization  of  a  fully  windowed  ToepUtz  matrix. 

LetT  -  Ui^j)  e  R'"’“ ,  m  >  n  be  a  fully  windowed  ToepUtz  matrix,  i.e.. 


tj_y  =0,  if  j  >  i,  or  i  > 

Then  it  is  easy  to  check  that  C  m  T^T  is  also  an  (unwindawed)  ToepUtz  matrix.  Now  consider 
the  following  matrix 


T'T  •j*  2*7* 

r  o  ' 


for  which  it  can  be  checked  that  a  generator  of  A  is 


(8) 


<^0  h  i\  '  *m-H  0  0^^ 

0  c,  ■  c,_,  to  f,  •  r„_,  0  •  0 
After  performing  n  steps  of  partial  triangular  factorization  using  the  generalized  Schur  algo¬ 


X  = 


e  E  = 


1  0 
0  -1 


rithm,  we  shall  have 


[R,  (2M  + 


O  O 
O  S 


From  (9),  one  can  easily  see  that 


T^T^R^R,  T  =  QR 

so  that  Q  is  orthogonal  because  RTQ^QR  =  R^R. 


(9) 


Remark  3.4  Recall  that  the  (fixed  AR)  lattice  filter  operates  on  a  (data)  sequence  (tj )  and  per¬ 
forms  orthogonalization  to  get  the  prediction  errors  without  a  knowledge  of  the  covariance 
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matrix  (i.e.,  T^T)  of  the  data  sequence.  The  lattice  filter  is  again  a  Levinson-like  version  of 
the  above  method. 

4.  Applications  to  non-Toeplitz  matrices. 

Now  we  shall  consider  various  extended  matrices  A#.  By  applying  the  generalized  Schur 
algorithm  in  Sec  3  to  judiciously  dxrsen  extended  matrices,  we  can  obtain  interestit^  results 
including  QR  factorizations  of  block-Toeplitz  or  Toeplitz-block  matrices. 

Generators  of  extended  matrices  in  this  section  can  be  also  easily  found  by  inspection. 
For  closed  forms  of  various  generators,  see  Appendix  4. 


A.  QR  factorization. 

Let  A  e  be  a  block-Toeplitz  or  a  Toeplitz-block  matrix,  and  let  us  define  the  block 
matrix. 


‘  -I  A  o' 

Ms  A^  O  A^  .  (10) 

.  O  A  I  , 

If  we  apply  the  generalized  Schur  algorithm  to  (10)  then  after  the  mth  step  we  shall  have  a 
generator  t  of 


A^A  A^' 

A  I  • 

After  another  n-steps  of  partial  triangulaiization,  we  shall  have 


(11) 


A^A  A^ 
A  / 


[RQ^]  + 


Now.  one  can  check  that  the  matrices  Q  and  R  in  (13)  satisfy 


(12) 


A=QR.  Q^Q^I. 

i.e.,  we  obtained  the  QR  factorization  of  A.  This  procedure  will  need  0(mn)  flops. 


t  One  can  suit  with  a  f/eamiot  of  ihe  extended  matrix  (IIX  ai  in  Example  3J.  A  cloaed-fofm  expiea- 
skm  for  a  feneraior  of  (II)  for  block-Toepiiu  end  Toeplitz-block  nuaix  >1  can  be  found  in  Appendix. 


If  one  wish  to  compute  R~^  ditecUy,  then  one  can  pcrfonn  the  (m+n)  steps  of  partial  tri- 


angularization  with  the  matrix, 

■  -/  A  O 
M  =  O  I  . 
.010. 

Note  that 


A'^A  I  R‘  -00 

I  O  ~  U  OS 


and  therefore,  U  =  /?"’  because  UR  -  I 


B.  Removing  Forward  Elimination  in  Square  Systems. 

If  one’s  primary  interest  in  the  factorization  is  in  solving  a  square  system  of  equations, 

i4x  =  b,  (13) 

then  one  might  want  to  obtain  the  transfonned  right-side  vector  y  »  L~'b,  during  the  course  of 

the  factorization  process.  This  can  be  done  by  performing  the  following  partial  triangular  fac¬ 
torization  of  the  matrix  Af, 


-bn-  y^ 


whence  the  solution  to  (13)  can  be  obtained  by  solving  the  triangular  system  of  equations. 


L^x  =  y. 


C.  Removing  Back-Substitution  in  Square  Systems. 


From  a  hardware  implementation  point  of  view,  the  back-substitution  step  in  (14)  can  still 
be  quite  cumbersome  [6],  This  back-substitution  process  can  also  be  eliminated  by  perfoiming 
the  partial  factorization  of  the  matrix. 


A  — b 

^  *  [/  0  • 

Notice  that  the  solution  x  s  is  the  Schur  complemem  of  i4  in  M .  Therefore,  after  n  steps 
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of  partial  triangularization,  we  shall  have  a  generator  of  the  solution,  from  which  we  can  read 
out  the  solution;  see  [6]  for  details. 

D.  Solving  Least  Squares  Problems  without  Back-substitution. 

To  solve  the  weighted  least  squares  problem  of  minimizing 
IU2(A,X  -  b)ll2. 

where  A  i  and  A  2  are  block-Toepiitz  or  Toeplitz-block  matrices,  we  form  the  matrix 

—A  2  A 1  — b 
A/  =  aJ  O  0  . 

O  I  0 

Now  notice  that  the  least  squares  solution, 

x^iA\Ai^A{r^Alh  (15) 

is  the  Schur  complement  of  the  submatrix 


Therefore,  after  m+n  steps  of  the  get^ralized  Schur  algorithm,  we  shall  have  a  generator  of 
the  solution  (15),  from  which  the  solution  can  be  read  out  [6]. 

E.  Regularization. 

If  the  given  Toeplitz  least  squares  system  is  particularly  ill-conditioned,  it  is  meaningless 
to  compute  the  exact  Ocast  squares)  solution,  since  small  perturbations  of  the  matrix  can  cause 
very  large  perturbations  in  the  solution.  In  such  cases,  we  may  solve  die  following  regularized 
system  [10],  [19] 


by  partial  triangularization  of  the  matrix 
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m  0 

^  ~  ni  o  0 
.0^0. 

After  m+2n  steps  of  the  generalized  Schur  algorithm,  we  shall  have  the  solution.  We  may 
remaric  that  this  technique  of  regularization  is  known  as  the  leakage  method  (adding  white 
noise  with  variance  to  the  data  sample)  in  the  signal  processing  literature  (see  e.g.  [3]). 


5.  Construction  of  Proper  Generators. 

We  shall  present  a  method  for  constructing  a  proper  generator  using  spinors;  for  a 
method  using  Householder  matrices,  see  Appendix.  A  spinor  e  R®*®  is  defined  as  the 
identity  matrix  except  for  the  foUowing  4  entries. 


where  [A],  j  denotes  the  (f,  y)th  element  of  the  matrix  A,  and  +  Sjj2  =  1.  The  parameters 
[c,  Si,  S2)  will  be  called  Schur  parameters.  Notice  that  the  inverse  of  a  spinor  is  also  a  spi¬ 
nor,  viz.,  5yj,)  is  the  identity  matrix  except  for  the  following  4  entries. 


Let  x^  e  and  y^  e  R*^  be  row  vectors.  Let  c,  sj  and  si  be  chosen  as 


[xiyi  +Xjyj 

and  define  x'  and  y  by 


S2  =  -c-^, 
Xi 


Then  it  is  easy  to  check  that  x/  =  y/  =  0,  and  x'^y'  =  x^y.  We  shall  call  the  elements  z,-  and 
y,-  pivoting  elements.  Therefore,  by  repeating  this  process  we  can  annihilate  all  elements  of  x 
and  y  except  the  pivoting  elements,  resulting  in 
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An  arbitnuy  choice  of  pivoting  element  or  an  aibitraiy  ordering  of  annihilation,  might 

result  in  [1  +  ^  0,  for  which  real  spinors  do  not  exist  The  following  function  returns  an 

Xiyt 

index  "pvt",  and  two  sets  of  indices,  FIRST  and  NEXT;  if  we  annihilate  the  elements  in  x^ 
and  whose  indices  are  given  in  die  set  FIRST,  pivoting  with  the  element  Xpy,  and 
before  the  annihilation  of  the  elements  given  in  the  set  NEXT,  then  it  is  not  hard  to  see  that 

[1  +  ^  0. 

^pviypvi 


Procedure  FindOrdering 
begin 

Compute  Yi  =  x.y;  for  ail  I  ^  t  S  a 
•T  •—  » 

Pset  :=  { I  I  Yi  >  0  }:  Nset  ;=  { i  I  Yi  <  0  );  Zset  :=  {  i  I  Y;  =  0  }; 
if  s  >  0  then 

pvt  :=  any  i  e  Pset; 

FIRST  :=  Pset;  NEXT  ;=  Nset; 
else  if  s  <  0  then 

pvt  :=  any  i  €  Nset; 

FIRST  :=  Nset;  NEXT  :=  Pset; 
else  /*  Cannot  route  *f 
return  (s); 

Add  Zset  either  to  FIRST  or  NEXT; 
return  (pvt,  FIRST.  NEXT) 


With  FindOrdering,  we  can  summarize  the  procedure  for  constructing  proper  generators. 


Procedure  MakeProper 
begin 

:=  first  non-zero  row  of  X ;  y^  :=  first  non-zero  row  of  Y ; 
FindOrdering: 
if  f  s  0  then 

return  ("A  has  a  singular  minor"); 
for  each  J  e  FIRST,  and  then  for  each  J  €  NEXT 
Determine  5(ytp«)  to  annihilate  xj  and  y/, 

X  XSQ\p^yi  Y  •*  F5(;^) 

end; 

return  ((X,  F)) 
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Remark  5.1  The  quantity  s  =  is  used  to  find  the  annihilation  ordering  is  the  product 

of  the  diagonal  elements  Uji  i^kjL  of  the  partial  triangular  matrices  L  and  U  obtained  by  the 
generalized  Schur  algorithm.  Therefore,  j  >  0  for  positive-  definite  symmetric  matrices,  and 
s  <  0  for  negative-definite  symmetric  matrices.  Hence,  for  these  mainces  we  can  choose  a  sin¬ 
gle  column  as  a  pivoting  column  throu^ut  the  triangularization  process. 


Some  Special  Cases. 

If  we  are  given  a  symmetric  generator  of  a  symmetric  matrix  A ,  i.e.,  if  T  =  XZ,  then  the 
updating  of  Y  in  the  above  procedure  is  redundant,  because  the  updated  {X',  Y'}  after  annihi¬ 
lating  a  row  is  still  symmetric.  To  see  this,  let 

=y^  =  [jCp^.yy]. 

Then  the  spinor  that  annihilates  Xj  will  reduce  to  a  Givens  rotation. 


^  O'lp'’*)  * 


c  -s 
s  c 


+  1 


On  the  other  hand,  if 


=  l«pw .  1.  ,  ~Uj  ] 

the  spinor  will  become  a  hyperbolic  rotation. 


ch  -sh 
~sh  ch 


ch^  -sh^=l 


Notice  that  Givens  and  hyperbolic  rotations  preserve  the  symmetry  of  the  updated  generator. 


i.e.. 


YS~^  =  K'  =  X*!,  X'  =  XS,  5:  a  Givens  or  hypeibolic  rotafion. 

As  another  special  case  of  spinors,  consider  the  two  tow  vectors 

v’‘=[v^,0]. 

For  this  case,  the  spinor  that  annihilates  uj  will  reduce  to  the  usual  elimination  matrix. 


J  -<E 
0  I 


.  ICg 


(16) 
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Example  5.1  The  celebrated  "Routh  procedure"  [llj  for  stability  testing  is  just  a  triangulari- 
zation  using  our  generalized  Schur  algorithm  of  the  Hurwitz  matrix  of  Example  2.1. 


Remark  For  the  triangular  factorization  of  a  block-Toeplitz  matrix,  one  may  use  the 
block-spinor. 


I  K, 

-kI  / 


t/-*,  = 


/  Kt 

-K\  i 


L-^,  LU  = 


I+KiKI  O 
O  /  +  atJaTi 


to  annihilate  X  1,2.  pivoting  with  X ij,  by  choosing  K  =  XfjX2,i.  However,  the  use  of  matrix 


inversion  in  the  block  rotation  makes  the  control  flow  in  most  hardware  implementations  com¬ 


plicated,  and  therefore,  the  use  of  block  rotations  is  often  discouraged;  our  recommendation  is 
to  use  the  generalized  Schur  algorithm,  which  only  operates  on  selected  columns  of  scalars. 
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6.  Concluding  Remarks. 

We  have  shown  how  to  obtain  triangular  factorization  and  QR  factorization  of  Toeplitz- 
block  or  block-Toeplitz  matrices  in  0{mn)  flops.  Our  method  is  based  on  the  displacement 
structure  properties  of  matrices.  We  also  presented  some  other  applications  of  our  algorithm. 

We  have  generalized  earlier  definitions  (see  e.g.  [14-17])  of  the  displacement  for 
Toeplitz-like  matrices  and  presented  a  correspondingly  generalized  Schur  algorithm  for  their 
factorization.  The  extended  definition  allows  us  to  handle  block-Toeplitz  and  Toeplitz-block 
matrices  and  Schur  complements  with  respect  to  the  leading  (block)  entries  of  such  matrices. 
Composite  matrices  obtained  as  products  and  inverses  of  Toeplitz  matrices  can  be  nicely  han¬ 
dled  by  formulating  them  as  Schur  complements  of  entries  in  a  suitably  defined  block-Toeplitz 
matrix.  Some  interesting  examples  were  given  in  Sec  4;  Several  of  them  will  be  considered  in 
other  ways  in  other  clusters  in  this  thesis. 

We  also  mention  that  displacement  structure  can  also  be  introduced  for  Hankel  and 
Hankel-like  matrices  (see  e.g.  [17])  and  also  Vandermonde-like  matrices;  analogs  of  the  (gen¬ 
eralized)  definitions,  algorithms  and  implications  in  this  chapter  have  also  been  obtained  for 
these  matrices  (see  Qum^r  4  or  [S]). 


APPENDIX  1. 


Finding  Generators  of  Matrices. 

Once  we  have  tiie  displacement  of  a  matrix,  we  can  obtain  a  generator  of  the  matrix  by 
representing  each  pair  of  non-zero  columns  and  rows  that  crosses  at  the  main  diagonal  as  a 
sum  of  two  rank-one  matrices.  As  an  example,  the  disfdacement  (10)  can 

represented  as  follows; 


-44- 


Po  To' 

Pi  T  T 

^{f/ /•*)  ~  p2  ®i  +  ®i  [  0»  P-it  Yo»  Y-i"  Y-2  1  Yj  ’ 

Pi 

where  e,'  denotes  the  vector  with  1  at  the  ith  position  and  O’s  elsewhere. 

In  general,  the  following  procedure  can  be  used  to  find  a  generator  from  the  given  dis¬ 
placement  with  0(mn)  flops. 


Algorithm  (Finding  a  generator) 

Input:  The  displacement 
Output:  A  generator  {X,Y]  of  A 

Procedure  FindGenerator 
X  :=4>;F  .•=<!•: 

while  there  is  non-zero  column  or  row 

for  each  pair  of  a  column  u  and  a  row  that  crosses  in  the  i  th  position  of  the  main 
diagonal  of  begin 

if  Ui  *  0  then  begin 

a  n  ?=  u  except  s;  =  0; 

V  :=  v/u/'^,  V  :=  V  except  P}  =  0; 
end 

else  begin 

a  ?=  u  except  Hi  =  1/2;  H  ;=  u  except  S;-  =  -1/2; 

:=  V  except  P,-  =  1/2;  V  :=  v  except  P;  =  -1/2; 

end; 

X  :=[X,  a.  01;  Y  :=  [Y.  -n 
Remove  u  and  v; 

end; 

for  each  an  unpaired  ith  column  u  begin 
X  ;=  [X,  uj;  Y  :=  [Y,  e,]; 

Remove  u  and  v; 
end 

for  each  an  unpaired  j\h  row  begin 
X  .•=  [X,  eyl;  Y  :=  [Y,  vj; 

Remove  u  and  v; 
end 

return  (2f,  Y] 


end 
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APPENDIX  2. 

Construction  of  a  Proper  Generator  using  Householder  matrices. 

The  matrix  E  of  the  form, 

£=/-2uv^,  u^v  =  l 

will  be  called  a  Householder  matrix.  If  u  =  v,  then  the  matrix  E  reduces  to  the  usual  (orthog¬ 
onal)  Householder  matrix.  One  can  check  that  E  has  self-inverse.  i.e., 

E  =  £■*. 

Let  X  €  ye  R"’‘“  and  x'  e  R"*^®,  /  e  R"*®*  are  given  vectors  that  satisfy 
tJ  y  =  x'^y',  tJY  =  x'^y. 

For  such  vectors,  we  shall  show  how  U)  find  E  such  that 

x'^  =  x^£,  »  y^£^.  (A2.1) 

To  to  this,  first  notice  that  (A2.I)  can  be  re-written  as 

x'^  =  x^  -  2p,v^.  y'^=y^-2p2U^  p,  P2^y^v 

so  that 

V  =  (X  -  x']/2p,.  u  =  [y  -  yyi^  (A2.2) 

Therefore, 

y^v  =  P2  =  y^(x-x1/2p,.  x^u  =  p2  =  x^[y-y']/2p2 
Hence,  the  Householder  matrix  £  where  u  and  v,  and  therefore  Pi  and  P2  are  chosen  sudi  that 

2P1P2  =  y^l*  -  xn  =  x^[y  -  y^ 
will  satisfy  (A2.2). 

Let  {X,  F)  be  a  non-proper  generator,  and  let  x^  and  y^  denote  the  top-most  non-zero 
rows  of  X  and  Y.  To  make  {X,  K)  be  proper,  we  choose  a  pivoting  column,  and  post-  multi¬ 
ply  X  and  y  with  £  and  £^,  respectively,  so  that 

x'^  =  x£  =  [0.  •  • .  0.  •  • .  or.  =y£^  =  (0.  •  • .  0.y^',  0.  •  ■ .  0]^. 

We  summarize  this  procedure  below. 


-46- 


Procedure  MakePioper 
begin 

;=  first  non-zero  row  of  X;y^  :=  first  non-zero  row  of  Y ; 
s  :=x^y; 
if  5  B  0  then 

return  ("Matrix  has  a  singular  minor”); 

Choose  an  appropriate  column  as  pivoting  column; 

Xpvt'  •-  [i -Xpyt/yp^]^,  := 

P,  :=  1;  p2  :=  [X  -  x1^y/2; 
u  :=  [y  -  y']/2p2;  v  :=  (x  -  x^/2; 

X  :=X  -  2Xuv^;  Y  :=Y  -  2Yvu^; 
return  ({2f,  y)); 

end. 


APPENDIX  3. 


1.  Derivation  of  Levinson  Algorithm  fh)m  generalized  Schur  Algorithm 

Let  X^^^  be  the  generator  obtained  after  the  jfcth  step  (k  ^  n)  of  partial  triangularization. 
starting  with  the  generator 


x(*)  = 


0  ■  0  4+u  ■  ^»Jk  ®  0 

0  0  0  w*+,jk  •  0  •  0 


(All) 


If  we  can  obtain  l^jt  and  from  u,j’s  in  (A3.1),  then  we  can  compute  the  Schur  parame¬ 


ters  for  the  next  hyperbolic  rotation,  and  apply  the  rotation  cxily  to  the  bottom  part  of  (A3.1); 


«u  •  •  «*>  0  •  •  0 

.«M  •  •  “ijk  0  •  •  0 

First  notice  that 


kj,  =  n(cA.^  -  sfi.2)''2  =  n(l  - 


>•1 


Also  it  is  easy  to  check  that 


shi 


(A3:2) 


r 

1 

. 

r  q 

1  I22  h:s  '  U-iji-i  Ujt 

*  ft  fll-l 

1  “W 

w,,  0  0  0  0 

'■  ' 

0  «U 

W3.1  0 

‘1  h  V 

•  V 

"X 

0 

f(i-t  f)i-2  1 

0  0  II, ^ 

%.l  0 
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Hence 


«*-U 


W*+lJk  =  [  tk>  h-U  ‘  »  1  1 


(A3.3) 


“U 


and  therefore,  can  be  coni{Nited  by  (A3.2)  and  (A3.3)  yielding  the  Levinson 


algorithm. 


2.  Derivation  of  Lattice  filtering  Algorithm  flrom  generalized  Schur  Algorithm. 

Let  be  the  generator  obtained  after  the  kih  step  (k  ^  n)  of  partial  triangularization, 
starting  with  the  generator 

[O  •  4.*  knjt  •  Ujc  •  f*m-H*k-\Jt  0  •  ol^ 

[0  •  0  ®  ’ 

where 


[hl.j,  •  •  ,  =  (/14,  •  •  =  Uo. 

Again  it  is  easy  to  see  that 


/ 

1.1 

/u 

•  •  /l/ 

^0 

^1 

4-1  0 

. 

... 

0 

to  ti 

4-1 

- 

... 

• 

0  \\ 

... 

• 

h 

tt  ^ 

0 

/m -11+1,1 

0 

0  0 

to  4_i 

• 

... 

0 

0 

fm^ 

1 

•  4-I.II-1 

W2.1  0 

0  -  0 

0 

WS.!  W3J 

0 

• 

• 

• 

0 

>*'11.3  • 

0 

Using  (A3.4)  and  the  fact  that  T^Q  «  r7,  one  can  show  dial 


(A3.4) 
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the  i  th  diagonal  element  of  Q^S  =  hI;,. 

Therefore,  we  have  an  alternative  way  of  computing  the  reflection  coefficients. 


kt  = 


bib, 


APPENDK  4. 


We  shall  give  generators  of  smne  of  important  matrices  exidicitly.  By  using  these  gen¬ 
erators,  one  would  not  need  to  use  the  procedure  FindGenerator  in  Apperxlix  1,  and  for  those 
matrices  the  operation  count  wiU  reduce  significantly. 

Lemma  A4-1  Generator  for  block  Toeplitz  matrices. 

For  the  matrix  A ^  in  (la),  a  generator  {X,  Y]  cf  Ax  with  respect  to  (Z^r*  ^Nr)  ^  given 
by 


r 

Ir 

BiBo*  ■ 

Bo 

B.i  • 

X  = 

0 

f 

Y  = 

0 

-B-i  • 

■  -B-s-¥\ 

Proof.  By  applying  one  step  of  (block)  Gaussian  elimination,  we  have 

where  K  =  is  the  Schur  complement  of  Aq.  □ 

Lemma  A4-2  Generator  for  Toeplitz  block  matrices. 

Let  Aj  e  i!"*"  be  an  M  x  N  array  of  ToepUtz  matrices.  e  where 

it  N 

denote  the  first  ct^tunn  of  Tij  andhjj  denote  the  first  raw  of  Tij, 

i-I  iml 

with  the  first  element  equals  to  zero.  Then  a  generator  of  Aj,  (2(.  Y)  with  respect  to 

It  N 

{  ©^.  eZ^]  is  given  by 
•■I  !■! 
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ar.i  ‘  ej  •  •  • 

*1  ■  ■  ■  i»l.l  **2.1  ■ 

X  = 

*2.1  *2.2  •  *2A  •  «i  •  • 

.  Y  = 

•  «1  •  •  *>1.2  *>2^  ■  *>M^ 

*M.i  **f;2  ■  *M/r  •  •  ■  ei 

•  Cl  bijv  •  b)4ji 

Proof.  The  matrix  A  -  F\AF\  has  non-zero  elements  only  on  the  1st  row  and  1st  columns  of 
each  block.  Among  other  possibilities,  if  one  assigns  vertical  strips  to  x,-  and  the  test  horizon¬ 
tal  strips  to  y,-.  then  the  above  generator  results.  □ 


Lemma  A4-3  Generator  for  orthogonalization  of  block  Toeplitz  matrices. 

If  A\  is  an  M  x  N  ToepUtz  array  of  square  matrices  Bi  e  R'’”'  as  in  (la),  then  a  gen¬ 


erator  [X,  XI)  ofM 

r  w 

is  given  by  X  -  y 


A[A,  A] 
A,  / 


,  wiA  respect  to  F  -Zfi^ QZ^r.  where  Z  =  />  © 


where 


'  Bfft 

0 

'  0 

BiS 

B-i 

BiS 

.  . 

Bm-\S 

M-l 

where  S  . 

i-O 


0 

■ 

I 

BoS 

0 

Bu-\ 

,  Y  = 

BiS 

0 

BiS 

0 

Bi4-N*\  . 

.  Bi4-\B 

0 

Bm-\S 

0 

Proof.  TUs  is  a  straight-forward  extensitms  of  Lemma  2  in  [4].  □ 


Lemma  A4-4  Generator  for  Orthogonalization  of  Toeplitz-Mock  Row 


Let  A  =  [  r,,  T^,  ,Tfi  ].  where  Tj  e  R****^  are  Toeplitz.  Let  »j  be  the  first  column  of 
Tj,  bJ  be  the  first  row  ofTj  with  the  first  element  equals  to  zero,  and  cj  be  the  last  rowofTj 


shifted  right  by  one  position,  a  generator  (X,  X£)  of 

N 

B  “  t©Zfcl©Z^,  where  I  ■  Ip^i  ©  -In*\.  is  given  by 

i-i 


A^A  A^ 
A  I 


with  re^ct  to 
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where 


X  = 


Wj  W2  Vln 

ai/llaill  ayila^il  -  a^f/llaj^ll 


WiV+I 

*1 


U|  U2 

a|/lla|ll  aj/llajli 


U/V  '"‘TN+l 
aA//l)a/(rll  0 


W;  =  A  ^''•^’^a./l  la,  1 1,  W/V+I  =  [bf,  bj^. . . ,  b^ 

('■/)  r  T  T  T  tT 

U;  =  W;  '  .  W2;^,+2  =  (c/,  C{.  .  .  ,  Cjli  Y  , 

and  is  the  matrix  obtained  c^er  setting  the  first  columns  all  blocks  up  to  the  ith  block 
by  null  vectors. 


Proof.  Notice  that  the  1st  row  of  A^ A  -  FA^AF^  is  i4fi4,  the  WjUi  row  is  aIa^^K  and  the 
(fl,+/«2)th  row  is  i4pl®  and  so  on.  TTierefore.  to  each  of  above  rows,  pre-multiply 
corresponding  columns  crossing  on  the  diagonal  to  get  rank-one  matrices,  and  ai^ly  Cholesky 
factorizadon  to  each  rank-one  matrix.  The  first  Cholesky  factors  are  w,-,  1  ^  i  ^  and 
Schur  complements  are  w,-,  N+2  ^  i  ^  2N+1.  The  rest  part  of  the  matrix  A^A  -FA^AF^  is 
rank-one,  w^+iW2v+2" 

For  V,  notice  that  F)-  ^Iw+jCw/^+i+i,  F)  has  nonzero  elements  lU,-  II  on  the 

diagonal  of  the  T,-  block.  Therefore, 

^2/V+2('^i«  +  ^2?V+2(v^+i +1  >  ^)^2V+2('''^-M+It  F) 

forms  the  lower-triangular  part  of  the  T,  block.  lC2/v+2(^Af+i»  P)  forms  the 

rest  upper  triangular  parts  of  all  blocks.  □ 


Lemma  A4-5  Generator  for  OrUiogoiuUzation  of  stacked  Matrices. 

Let  A^  a\,  Alt .  1.  where  Ai  e  F"*”*.  Let  {X,,  be  a  generator  of 

AfAi  Al] 


widi  respect  to  where 


Ai  / 


Then  the  matrix 


M 


A^A  A^ 
A  I 


has  a  generator  (X,  X£},  with  respect  toF  ^  Fz^tOF^,]  where 


i-l 


W 

V 


W-IX„..,X„].  V-Fi©-  -©Fj,. 
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Proof  Because  A^A  =  '^JAi,  wc  have 

A'^A  -  F2A^AFI  =  B  AlAi  -  FjAjAiFl  ]  =  VPBV^. 

Also 

A  -  FAFl  =  =  WP^.  □ 

Lemma  A4-6  Generator  for  Orthogonalizatkm  of  Toeplitz  block  Matrices. 

Let  A  be  an  MyN  array  of  Toeplitz  matrices  Tjj  Then  the  maaix 

A^A  A^' 

^  j  has  a  generator  {X,  XZ)  with  respect  to  Fi=Z„^QZ„^Q  •  ■  ©Z^  and 
Fi  =  Z,,©Z,j©  •  •  ©  Z,^,  where 

X^[W^.V^f,  W=[Wi . W^,],  V  =  ^,©^2©- •  •  ©V/^. 

AlAi  -  F^AfAiFl  =  WitiWf.  A.-  -  Z^A.fJ  *  V^wT 

-  [  Fi  i,  Ti;i,  .  .  ,  r,_jv  ]. 

Proof.  Immediate  from  Lemma  A4-4  and  A4-S.  □ 
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Chapter  3. 

Generalization  of  Gohberg-Semencul  Formulas 

Abstract 

Gohberg  and  Semencul  gave  some  elegant  fonnulas  for  the  inverse  of  a  Toeplitz  matrix 
as  a  difference  of  products  of  lower  and  upper-  triangular  Toeplitz  matrices.  There  are  several 
algebraic  and  analytic  proof  of  these  formulas.  Here  we  give  a  "constructive"  proof  for  the 
Gohberg-Semencul  formulas,  under  the  assumption  that  the  matrices  are  strongly  non-singular, 
i.e.,  all  leading  minors  are  nonzero.  This  assumption  is  stronger  than  necessary,  but  it  enables 
fast  Oin^)  constructions  for  the  entries  in  the  Gohbeig-Semencul  formulas.  Our  method  also 
gives  a  natural  generalization  of  the  formula  to  matrices  with  displacement  structure. 
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1.  Introduction. 


If  T  is  a  Toeplitz  matrix, 

T  =  (Ci_;)  6 

then  one  can  easily  show  that  T  has  the  following  displacement  representation  [5,  13-16], 


Co  0  0  Co  c_i  ci^  0  0  0  0  0  ‘^-1  ^-2 

Cl  Cq  0  Co  C2^  Cl  0  •  •  0  0  C_1  C2-, 

coT=  •  •  -  Cl  ....  1) 

0  •  •  c_,  •  0  0  •  •  ‘^-i 

C«-1  C„_2  Co  0  0  Co  C,_i  C„_2  Cl  0  0  0  0  0 

It  is  an  inteiesting  faa  [5]  that  T*’  also  has  a  similar  displacement  representatioa  To 

give  an  explicit  formula,  we  assume  that  the  matrix  T  and  its  (n-l)  x  (n-l)  leading  principal 

submatrix  are  nonsingular.  Let  z  and  v  denote  the  first  and  last  columns  of  respectively. 

Then  it  turns  out  that  we  can  write 

21  0  0  V,  v,_,  V,  0  0  0  0  0  r,  r,_,  r, 

22  z,  0  V,  V2  V,  0  •  •  0  0  2,  22 

zj'r'=  •  •  •  •  .  _  .  V,  •  .  •  (2) 

•  •  0  •  •  v,_,  .  .  0  0  •  •  2» 

2d  2,_1  2i  0  0  V,  Vd_l  V„_2  Vi  0  0  0  0  0 

It  also  holds  that  zi  =  v,.  This  fonnula  was  first  given  by  Gohberg  and  Semencul  in  1972  [7, 
p.  86],  [10]  as  one  of  a  set  of  slij^y  different  foimulas  of  the  type  (2)  for  T~'.  We  shall 
describe  a  variant  in  Sec  2  (see  Remark  2  in  Sec  2).  Here  it  is  interesting  to  note  that  a  simple 
inspection  of  (2)  yields  the  following  recursive  formula  for  the  elements  of  T~', 


2t['r‘]i+ij>i  =  2i[’r‘].j  +  2,+,v,_y  -  v,2,_^+i,  [A],J  ■  (ij)  element  of  A,  (3) 
which  was  in  fact  first  derived  by  Trench  [21]  in  1964.  Its  relatimiship  to  the  Gohbeig- 

Semencul  formulas  and  connections  with  the  Christoffel-Darboux  fonnulas  for  orthogonal  poly¬ 
nomials  on  the  unit  circle  were  made  in  [16]. 
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As  one  might  expect  from  the  above  discussion,  there  can  be  several  ways  of  establishing 
Gohberg-Semencul  formulas.  But  all  presently  known  fvoofs  involve  a  certain  amount  of  alge¬ 
braic  and  analytical  manipulation.  In  this  chapter  we  shall  describe  what  may  be  regarded  as  a 
"constructive"  proof  for  Gohberg-Semencul  formulas  [7.  p.  86.  p.89]  under  the  rather  restric¬ 
tive  condition  of  strongly  non-singular  T,  i.e..  T  with  all  leading  minors  nonsingular.  On  the 
other  hand,  with  this  assumption,  there  are  "fast"  G(n^)  algorithms  for  actually  computing  the 
Gohberg-Semencul  expressions. 

Our  proof  follows  the  so-called  "array  method"  discussed  in  [4].  In  the  array  method,  the 
triangular  factors  of  T*'  and  T  are  simultaneously  obtained  via  a  sequence  of  elementary 
hyperbolic  rotations  applied  to  a  certain  "pre-array"  of  scalars.  We  show  in  this  note  that  a 
slight  modification  of  the  pre-array  leads  to  Gohberg-Semencul  formulas.  Our  approach  can  be 
extended  to  obtain  formulas  of  the  Gohberg-Semencul  type  for  "close-to-Toeplitz"  matrices. 

The  present  contribution  grew  out  of  our  studies  on  constructive  algorithms  [4]  for  fast 
triangular  and  orthogonal  factorizations  of  matrices  given  in  a  so-called  displacement  represen¬ 
tation  (see  [14]).  We  discovered  that  one  of  these  constructions  not  only  gave  a  Gohbeig- 
Semencul  formula  for  the  inverse  of  a  Toeplitz  matrix,  but  also  indicated  a  natural  method  of 
extending  the  formulas  to  general  non-Toeplitz  matrices. 

The  displacement  of  a  matrix  A  e  has  been  defined  as  [14] 

VAmA-Z^AZl.  (4) 

where  Z,  is  the  n  x  n  lower  shift  matrix  with  ones  on  the  first  subdiagonal  aixl  zeros  every¬ 
where  else.  Suppose  that  VA  is  expressed  as  a  sum  of  outer  products, 

VA  =  Zxij.K 

i-l 

Then  it  is  easy  to  see,  using  the  nilpotency  of  Z, ,  that  A  can  be  written  as  a  sum  of  products, 

A  =  XL{x,)L^(y,), 

i-l 


(5) 
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where  as  before  L(x)  denotes  a  lower  triangular  Toeplitz  matrix  with  the  first  column  x.  The 
expression  in  (S)  is  called  the  displacement  representation  of  A.  For  symmetric  matrices  we 
can  refine  this  to  the  form. 


P  P“^ 

A  =  -  2  L(x.)L^(*.)-  (6) 

i*l  i^+1 

For  a  simple  example,  consider  a  symmetric  Toeplitz  matrix,  T  s  cq  =  1-  Then 
we  can  see  that 


where 


VT=:T-Z.TZJ  = 


1  C|  •  • 


Cl 


o 


=  -  X2X2. 


Xj  *  (  I,  C  j,  ■  •  ■ ,  C„  1  ,  X2  —  (  0,  C  ,  C,i  ]  . 

It  follows,  as  noted  in  (6),  that 

T  =  L(x,)L^(x,)  -  L(x2)L^(X2).  (7) 

The  fact  that  the  Gohbeig-Semencul  formula  (3)  for  T~'  has  a  similar  form  as  (7)  is  not 

an  accident.  In  fact,  it  was  shown  in  [13, 14]  that  if  the  matrix  A  has  the  form  (S),  then  1a~‘1 
(where  1  is  the  matrix  with  ones  on  the  reverse-diagonal  and  zeros  elsewhere)  has  the  represen¬ 
tation. 


1A->1  =  iL(Xi)L^(yi).  or  A"' =  iL^(x.)L(y.)  (8) 

1-1  i-i 

with  certain  vectors  (xj,  y,- ).  (We  may  note  that  if  we  insist  on  a  lower-upper  type  representa¬ 
tion  for  A'*,  we  can  always  obtain  this  but  with  a  -t-  2  terms  in  general).  The  formula  (8) 
would  be  a  generalization  of  the  Gohberg-Smnencul  fonnula  to  arbitrary  matrices,  except  that 
the  proof  in  [14]  did  not  show  how  to  actually  construct  a  displacement  representation  of  1a~'1 
or  A~*  from  a  displacement  representation  of  A. 
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This  construction  will  be  supplied  in  this  chapter  for  a  large  class  of  matrices,  and  in 
principle  for  all  matrices.  We  shall  first  give  a  constructive  proof  of  Gohbeig-Semencul  for¬ 
mula  in  Sec  2.  In  Sec  3  we  shall  generalize  the  earlier  notion  of  displacement  representation. 
Then  we  shall  show  that  a  certain  partial  triangularization  procedure  of  the  2  x  2  block  matrix. 


A  = 


Ai.i  Aij 

A2.I  ^2^ 


Al  l  =  strongly  nonsingular,  (9) 

easily  yields  a  displacement  representation  of  the  so  called  Schur-comi^ement  of  A]  1,  i.e..  a 
representation 


P  P'W 

A2^  —  A2_iA|  }A|_2  =  “  £  L(*i)L^(yi)- 

As  an  example,  we  shall  obtain  the  Gohbeig-Semencul  fonnula  for  a  Toeplitz  matrix  by  apply¬ 
ing  the  procedure  to  the  block  matrix 


A* 


T  I 
I  O  ’ 


where  the  Schur  complement  of  T  is  just  -T*.  Then  in  Sec  4,  we  shall  apply  the  procedure 
in  Sec  3  to  various  matrices,  e.g.. 


T^T  I 

T,  T2 

'j'T*  'p  'pT* 

'pT*  'p  rjT 

I  0 

t 

T2^  0 

’ 

T  I 

f 

I  0 

to  obtain  generalized  GohbergSemenctd  formulas,  or,  equivalently,  displacement  representa¬ 
tions,  of  the  matrices. 


(T^T)-'. 


(jT-jj-ijr 


2.  A  Constructive  Proof  of  two  Gohberg-Semencul  Formulas. 

Let  us  consider  a  symmetric  positive  definite  Toq>liiz  matrix  T  =  (c,_y)  s  ,  co  =  I- 
Later,  we  shall  indicate  simple  modifications  of  the  proof  for  nonsyiametric  but  strongly  non¬ 
singular  Toeplitz  matrices.  As  shown  in  Sec  1,  T  has  the  representation. 


T  *  L(x,)L^(x,)  -  L(X2)L^(x2). 
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Let  us  define  the  pre-array  Aq,  and  a  diagonal  matrix  J, 


I 

■l(x,)  O, 

L(X2) 

e  R?*^,  J  a 

Ao  a 

I- 

I, 

-I, 

€  R 


3n>3it 


(10) 


where  O,  and  I,  denotes  the  n  x  n  null  matrix,  and  the  n  x  n  identity  matrix,  respectively. 
Suisse  that  we  can  find  a  J-orthogonal  matrix  0  e  ,  viz.,  one  satisfying  0JO^  =  J, 
such  that  the  post-array  AqQ  has  the  foim 


As  Ao0  = 


L  O. 


O. 


U  Uy,)  Uyj) 


(11) 


where  L  is  a  lower  triangular  matrix,  and  L(yi).  L(y2)  ^  lower  triangular  Toeplitz  matrices 
whose  first  columns  are  some  yi  and  y2.  respectively,  while  U  is  not  a  priori  constrained  in 
anyway.  Then,  because  of  the  J-orthogonality  of  0, 


AoJAo^  =  AjA^ 

which  yields  the  following  identities 


L(x,)L^(x,)  -  L(x2)L^(x2)  =  T  =  LL^,  (12a) 

L(x,)  -  L(X2)  =  I,  =  LU^,  (12b) 

UU^  +  L(yi)L^(y,)  -  L(y2)L^(y2)  =  O, .  (12c) 

From  (12b),  we  sec  that  L"‘  =  (therefore,  U  is  upper  triangular).  Now  (12a)  shows  that 

T*  =  L-^L-‘  =  UU^.  (13a) 


Therefore,  (12c)  implies  that  'P'  has  the  following  disi^acement  representation. 


P‘  =  L(y2)L^(y2)  -  L(y,)L^(yi).  (13b) 

Next  we  shall  show  how  to  construa  0  so  as  to  obtain  a  post-array  A  of  the  form  (1 1). 
This  can  be  done  in  several  ways,  but  pertu^  the  simplest  is  to  construct  0  as  a  product  of 
elementary  hyperbolic  rotations,  which  are  J-oithogonal.  (We  could  also  use  hyperbolic 
Householder  reflections.)  Hyperbolic  rotations,  H,-j(k)  are  defined  as  identity  matrices  except 
for  the  following  four  entries. 
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[H.  j(K)].^  =  [H.j(ic)l,  j  [Hi  j(ic)].-^  =  [Hi  j(K)]^^  =  (14) 

We  call  the  pair  of  iivlices  (i  J).  the  plane  of  rotation.  The  matrix  Hij(K)  is  real  and  finite 
when  IkI  <  1.  In  signal  processing  af^lications,  k  is  often  called  a  reflection  coefficient.  Let 
be  a  row  vector  with  Iwil  >  Iwy-I.  We  can  annihilate  wj,  pivoting  wiA  wi,  by  post- 
multiplying  with  Hij(Wj/Wi), 

[Wi  •  •  Wi  •  •  •  •  W,lHij(W^/Wi)  *  [w,  •  •  Wi'  •  •  w^_,  0  Wy+,  •  •  w,]. 

For  a  moment,  let  us  assume  that  the  magnitude  of  pivoting  elements  is  always  greater  than 

that  of  pivoted  elements  and  therefore,  that  iicl  <  1.  A  lemma  given  in  the  Aiqjendix  shows 

that  this  assumption  is  always  valid  for  a  positive  definite  T. 


A  Simple  Example. 

Our  construction  is  perhaps  best  followed  with  a  simple  example.  Thus,  before  we 
describe  the  general  procedure,  we  shall  illustrate  the  details  with  a  3  x  3  symmetric  positive 
definite  Toeplitz  matrix, 

1  a,  02 

T  =  Oi  1  Oi  ,  Xi  =  (1.  Oj,  02]^,  X2  =  [0.  Oj,  Ojl^. 

02  O]  1 

We  post-multiply  the  pre-array. 


■l(x,)  O,  Uxj)' 

Ao=  ,  o  I  e  (15) 

L  *11 

with  hyperbolic  rotations  H2,7(Kt)  and  H3  g(K|),  where  K)  ■  Oj,  to  annihilate  the  1st  sub¬ 
diagonal  of  L(x2)  pivoting  with  the  diagonal  of  L(X|)  (see  (16)  below).  To  preserve  the  Toe¬ 
plitz  structure  at  the  (2,  3)  block,  we  also  apply  a  "dummy”  hyperbolic  rotation  H4^Ki).  This 
will  introduce  a  nonzero  element  ^4  at  the  lower  left  comer  in  the  (2,  2)  block.  Tliese  steps 


are  illustrated  below. 
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Ao‘H2,7(Kj)-H3  g(Ki)  Ao-H2,7(Ki)*H3g(Kj)‘H49(Kj) 


Now  we  post-multiply  Aj  with  the  hypcibolic  rotation  HgTCKj),  where  K2  =  PyPi.  to 
annihilate  the  remaining  element  P3  in  the  (1,  3)  block  of  Aj.  Again  to  preserve  the  Toeplitz 
structure  in  the  (2,  3)  block,  we  aRjly  two  dummy  hyperbolic  rotations,  H4_g(K2)  and  H5,9(K2) 
to  obtain  A. 


Aj'H3^7(K2)  Ai’H3^7(K2)‘H4j(K2)‘H5^(K2) 
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T-‘  =  Uji)h^(j2>  -  Uji)L^(Jil 

The  general  procedure. 

In  general,  we  successively  annihilate  the  1st,  2nd,  ■  ■ ,  (n-l)th  subdiagonals  of  the 
lower  triangular  matrix  in  the  (13)  Uodt  of  Aq,  by  post-multiplying  Aq  with  a  sequence  of 
hypertmiic  rotations.  The  procedure  of  annihilating  the  tth  subdiagonal  will  be  called  the  ith 
sweep,  and  the  anay  obtained  after  the  ith  sweq>  will  be  denoted  with  Ai.  In  the  ith  sweep, 
we  apply  hyperboUc  rotations  on  the  foUovring  two  sets  of  idanes  in  A,_], 

Ai  *  {(i+13«+l),  (/+2,2«+2),  •  • ,  (n,3n-i)},  (18a) 

®  {(/i+l,3/t— I’+l),  (n+2,3n—l+2),  •  • ,  (n-fi,3n)},  (18b) 

where  >4,  and  f),-  stand  for  the  sets  of  planes  on  which  "annihilating''  hypeibolic  rotations  and 

"dummy"  hyperbolic  rotations  are  applied,  respectively.  Thus,  if  we  display  (18)  for  each 

sweep,  we  have  the  pictorial  representation: 

<=1  2  3  —  n  n+1 

i-2  3  n  rt+l  n+2 

\  M  \ 

i=n-l  n  n+1  n+2  —  2n-l 

pivoting  columns 

Within  a  given  Ai  (or  Oi),  any  ordering  of  rotations  can  be  chosen  because  the  planes  in  (18a) 
(or  (18b))  are  "disjoint". 

Noie  that  Aq  rias  Toeplitz  stnicture"  in  the  columns  (1, 2,  •  • ,  2n)  and 
(2n+l,  2n+2,  ■  • ,  3n).  Suppose  that  A,.]  has  Toeplitz  stnicture  in  the  columns 
((,  <+l,  •  ■  ,  2n)  and  (2n+l,  2n+2,  •  • .  3n).  Then  this  Toe(ditt  stnicture  allows  us  to  choose 
hyperboUc  routions  on  the  set  i4,-  with  identical  reflection  coefficients  iq,  to  annihilate  the  /th 
subdiagonal  elements  in  the  (2.3)  Mock  U  A-u 


2rt+l  2/1+2  -  3/1-1  3/1 

2/1+1  2n+2  •  3/1-2'^ 3/i-l  3n 


2/1+1  2n+2  -  3/1 

pivoted  columns 


A,_,'  ■  Ai.,H^(K,),  H,,,(Ki)  w  •  •  H,4,^(ic,).  (19a) 
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Furthennoie,  by  applying  hypeitolic  rotations  on  the  set  D,-  with  the  same  K,  ,  we  can  keep  the 
Toeplitz  structure  in  the  columns  of  (i+1,  i+2.  •  •  ,  2n)  and  (2rt+l,  2n+2,  •  • ,  3n)  in  A,-, 

A.  a  Ho/lfi)  a  (1C,).  (19b) 

In  each  sweep,  the  null  (1,2)  block  is  untouched,  and  the  lower  triangulariQr  of  the  (1,1)  block 

is  maintained. 

Hence,  A^.i  will  be  the  form  of  A  in  (11),  and  will  have  Toeplitz  structure  in  the 
columns  of  («,  n+l,  •  •  2rt)  and  (2/i+l,  2fl+2,  •  • ,  3rt).  Therefore,  the  diagonal  elements  of 
L(yi)  will  be  zero  and  the  first  column  of  L(yi),  (i.e.,  Ji)  will  be  identical  to  the  last  column 
of  U  shifted  down  by  one  position. 

Now  we  shall  show  that  the  displacement  representation  (13b)  obtained  by  the  above  con¬ 
struction  is  the  (first)  Gohberg-Semencul  formula  (2).  Notice  that 

T'e,  =  [L(y2)L^(y2)  -  L(y,)L^(y,)]e,  =  L(y2)L^(y2)e,  =  IL(y2)]i.iL(y2)e|, 

T'e,  =  (UU^)e,  =  (U],^U«„ 

where  e,-  denotes  the  vector  with  I  at  the  ith  position  and  0  elsewhere.  Therefore, 

First  column  of  Ujt)  =  L(y2)ej  =  (T~*ej)/(L(y2)]i,i  = 

First  column  of  L(yi)  =  L(yi)ei  =  sd(Ue,)  =  sd(T''e,)/[U],^  =  zr''^sd(v), 
where  jd(v)  derxrtes  the  vector  v  shifted-down  by  one  position,  and  we  have  used  the  easily 

verifiable  fact  that 

zi  =  V,  =  [L(y^]?,i  =  [Uli^. 

With  these  identifications,  the  displacemoit  representation  (13b)  is  exactly  the  Gohberg- 
Semencul  formula  (2). 

Remark  1.  The  sequence  of  hyperbolic  rotaticms,  H/(^(Kj)  (see  (19)  for  the  notation)  introduces 
the  ith  superdiagonal  in  the  (2,1)  block,  and  the  ith  subdiagonal  in  the  (2,3)  block.  On  the 
other  hand,  H/)^(tq)  introduces  the  (n-i)th  subdiagonal  in  the  (2,2)  Mock. 
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Remark  2.  Readers  may  have  noticed  that  our  constniction  of  A,  and  therefore,  the  resulting 
displacement  representation  (13b)  is  not  unique.  This  is  because  we  can  use  any  J-oithogonal 
transfonnation  matrix  6.  In  panicuiar.  we  can  apply  extra  hyperbolic  rotations  on  with 
any  IkI  <  1.  i.c.,  we  can  replace  A„_|  by 

A,_,Ho.(ic).  (20) 

and  still  have  the  fonn  (II).  In  fact,  the  second  Gohbeig-Semencul  formula  [7,  p.  89] 

corresponds  to  the  particular  disfdacement  representation  (13b)  obtained  by  choosing  the 
reflection  coefficient  k  in  (20)  as  the  last  reflection  coefficient  k,  of  any  nonsingular 
(n+l)  X  (n-i-l)  Toeplitz  matrix  whose  n  x  n  leading  principal  submatrix  is  T. 

Remark  3.  Note  that  our  construction  of  the  Gohberg-Semencul  formula  needs  O(n^)  opera¬ 
tions.  because  we  need  to  keep  track  only  two  columns  in  A,-,  and  ai^ly  only  (n-l)  different 
2x2  hyperbolic  rotations.  Our  construction  here  is  closely  related  to  the  classical  Schur  algo¬ 
rithm  (13,  18.  20],  which  has  certain  advantages  for  parallel  computatioa  The  first  and  last 
columns  of  T*',  which  defitK  the  matrices  L]  and  L2  in  the  Gohberg-Semencul  formula,  can 
also  be  obtained  with  0(n^}  operations  by  using  the  Levinson  algorithm  [S].  In  fact.  Trench 
[21]  used  this  algorithm  to  obtain  the  "differential  form”  (3)  of  the  Gohbetg-Sonencul  formula. 

3.  Generlaized  Displacement  Representations  and  Schur  Complements 

First,  we  shall  generalize  the  concept  of  displacemem  representation,  and  then  give  a  fast 
triangularization  algorithm  for  a  certain  2x2  block  matrix.  During  the  triangularization  pro¬ 
cedure,  displacement  representations  for  the  Schur  complement  of  the  (1. 1)  block  will  namr- 
ally  arise. 

The  following  sum-of-products  representation  of  a  matrix  A  e  is  called  a  (gerreral- 
ized)  displacement  representation  of  A  with  respect  to  the  displacement  iterators  (F],  F2): 

A-iK.(Xi.Fi)K:(y...F2). 
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where  Fj  e  /i"’*  and  Fj  e  are  nilpotent  matrices  of  index  less  than  or  equal  to  n,  i.e.. 
Ff  =  F^  =  0,  and  K,(x,,  F,)  e and  K,(y,-,  F2)  e are  the  so  caUed  Krylov 
matrices, 

Fi)  *  Fi*..  •  •  .  K^(3i,  F2)  »  [Ji,  FjJi.  .  .  .  FJ-‘yil. 

The  matrix  pair,  {  X,  Y  ),  where  X  s  [  X],  X2, . . .  ]  and  Y  »  [  yj,  72,  . . ,  Ja  ]  is  called  a 

generator  of  A  (with  respect  to  {F,.  F2)),  and  is  denoted  Ga(A,  Fi,  F2).  The  number  a  is 

called  the  length  of  the  generator  (with  lespea  to  {F,,  F2)).  A  generator  of  A  with  the 

minimal  possible  length  is  called  a  minimal  generator.  The  length  of  the  minimal  generator  of 

A  is  called  the  displacement  rank  of  A  (with  respect  to  {F,,  F2}),  and  denoted  as  a(A,  Fj,  F^. 

If  the  matrix  A  is  symmetric,  then  A  can  be  represented  as 

A  =  ZK,  (X. .  F)K,(x.-.  F)  -  L  K,  (X.- ,  I0K;(Xi .  F),  (21) 

«»1  i^-fl 

and  the  generator  G^+,(A,  F,  F)  can  be  written  as  {  X,  XS  }  ,  where  X  =  [  x, . x^^  ] 

and  Z  alp©-!,. 

A  displacement  representation  of  a  matrix  A  can  be  obtained  by  using  the  following 
lemma,  whose  simple  proof  we  shall  omit. 


Lemma.  For  any  A  e  ,  if  Fi  or  ¥2  is  nilpotent,  there  exists  a  ^  min(m ,  n )  such  that 


A  =  ZK,(x,,  F,)Kj'(y.,  Fj)  if  and  only  ^  A  -  F,AF2^  =  £x.y/'. 

•-»  i-l 

In  later  developments  an  important  role  will  be  played  by  2  x  2  block  matrices  (9).  For 
simplicity,  we  shall  first  consider  a  symmetric  matrix. 


A  ■ 


Ai.i  A,_2 

Au  A2J 


(22) 


where  Aj  ^  is  a  symmetric  positive  definite  matrix,  and  A2J  is  a  synunetric  matrix. 


For  the  matrix  A  in  (22)  we  shall  choose  both  displacement  operators  as 
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Z,  O 

'"■[o  • 

With  this  choice,  K,^(x,  ,  F)  in  (10)  will  have  the  fonn 

K,^(x.  .F)*  L(X2.i)  O  L(x,,)e /?->“.  L(x2,)  e 

where  [  xfj,  xjj  ]  ^  and  L(X|^)  and  L(x2j)  ^  lower  triangular  Toe(^tz  matrices  whose 
first  columns  are  Xj  ,-  and  X2,j,  respectively.  The  O’s  denote  rectangular  null  matrices  of 


appropriate  sizes. 


Generators  of  Schur  Complements. 

We  shall  now  show  how  to  obtain  the  displacement  representation  (with  respect  to 
and  Z„)  of  the  matrix  A^. 

A,  a  A2^  -  Aj^Ai"}Ai^, 
which  is  the  so-called  Schur  complement  of  A) 


1.  Obtain  a  generator  of  A,  say  (  X,  X£  },  £  =  0  -I,. 

2.  Forni  the  matrix  A, 

A  a  K(x„  F).  •  •  K(x^.  ¥).  K(x^^,.  10.  •  •  MXp^ .  F)  ]  (23a) 

fUx,.,)  Ol  fUx,,)  O]  rUx,,^,)  Ol  rL(x„^)  6] 

=  [  [UX2.,)  oj  '  •  [u*v)  oj  [Uxj,^,)  oj  •  •  [l(X2,^)  oJ  J-  (23b) 

3.  Post-multiply  A  by  a  J-orthogonal  matrix  O,  where  J  a  0~I(ii4m)f  O-c-  the 
matrix  9  is  such  that  9J0^  =:  J),  that  will  transform  A  as  follows  (We  shall  show  how 
to  do  this  in  the  proof): 


(i)  K(x,,  10 

(ii)  K(Xj,  F) 


L  O 
M  L, 


L  is  lower-triangular,  L|  is  lower  triangular  Toe|ditz 
L,-  is  lower  triangular  Toeplitz,  2  ^  i  ^ 


The  result  wiU  be 
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AmAe  = 

Then  it  turns  out  that 


r  t* 

L 

> 

0 

0  0 

f  “ 

0  0 

M 

.  V 

L2  0 

k  0 

0 

A,.i  =  LL"". 
=  ML^. 


A,  a  A2^  —  Aj^AiJAj^  =  2L,L,^-  2 

i“l  i»p+l 


(24) 


(25a) 

(25b) 

(25c) 


Note  that  the  generator  of  A,  has  the  same  p  and  9  as  the  given  generator  of  A  itself,  a 
fact  first  discovered  by  Morf  [19]  and  used  by  him  to  derive  ”divide-and'Conquer”-type  algo¬ 
rithms  (see  also  [3]). 


Proof  The  results  in  (25a,  b)  follow  uiunediaiely  by  equating  the  (1, 1)  and  (2, 1)  blocks  on 
both  sides  of  the  equality, 

A  =  AJA^  =  A0je^  A^  =  .  (26) 

Furthermore,  equating  the  (2, 2)  blocks  of  (26)  gives 

Aj^  =  MM^  +  ZLiLl -  LiLf, 

im\  <ap+l 

from  which  the  equality  (25c)  follows  by  TK>ting  that 

MM^  =  A^2L-^L->A,^  =  AfzArjAi^. 

The  only  thing  left  to  do  is  to  show  how  to  construct  O  so  as  to  obtain  A  of  the  form 
(24).  This  can  be  done  in  several  ways,  but  perhaps  the  simplest  and  the  most  useful  is  to 
construct  0  as  a  ptodua  of  (p-t^Xn-Hn)  x  (p+qXn+m)  circular  (or  Givens)  and  hyperbolic 
rotations.  Givens  rotations  G,  j(k)  with  reflection  co^kient  k,  are  defined  as  identity  matrices 
except  for  the  following  four  entries. 


We  annihilate  (  L(X|^.  L(xu)f  •  •  t  }  in  (23b)  with  n  sweeps  (Odi,  Ist, 

■,  (/t-l)st  sweeps).  The  ilth  sweep  annihilates  the  ikth  sub-diagonals  of  the  p-1  matrices. 
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{  L(X|,2).  L(xi;3)-  ■  .  L(Xi^)  }  With  Givens  rotations,  and  the  A; th  sub-diagonals  of  the  9 
matrices.  {  L(xi^+i).  L(x,,+2)-  • .  L(xi,+^)  1  with  hyperbolic  rotations,  pivoting  with  the 
diagonal  elements  in  L(xt,i)  in  both  cases. 

In  the  A:th  sweep,  if  A;  >  n-m,  then  we  apply  ‘dummy’  rotations  to  the  (nHfc-i-l)st  to  the 
/nth  columns  of  K(Xi,  F).  pivoting  with  the  (n-t-l)st  to  (m-i-iL)th  columns  of  K(xi,  F),  in  order 
to  keep  the  Toeplitz  structure  in  {  L(x2^,  L(x23). . ,  L(x2^4^)  ).  This  will  introduce  a  non¬ 
zero  lower  triangular  Toeplitz  matrix  in  the  (2,  2)  block  of  K(xi,  F).  After  the  (/i-l)st  sweep, 
we  shall  have  A  in  (24). 

□ 

Operation  Counts. 

To  annihilate  n  rows  out  of  n+m  rows,  we  shall  need  approximately 

n-HR 

4(p+q-l)Y,k=2(p+q-\)y.in^+2nm+2m+n)  multiplications.  This  will  be  less  than  the 

0(n^)  multiplications  needed  to  obtain  the  factors  of  a  matrix  A]  1  and  its  inverse  unless  p+q 
is  neaiiy  n .  There  are  many  interesting  matrices  (see  Sec  4)  for  which p-¥q  c  n. 


Example  (A  Gohberg-Semencul  formula  for  Toei^tz  matrices). 

Let  T  a  (c,-_y  )  be  an  n  X  n  Hermitian  positive  definite  Toe{4itz  matrix.  Consider  the 
matrix 


A  ■ 


T  I» 

I-  o 


(27) 


For  F*Z,©Z,,  it  is  easy  to  see  that  the  displacement  tank  of  A  with  re^a  to  {F,  F)  is 
two,  and  that  a  generator  C2(A,  F,  F)  is  given  by  (X,  XZ).  where 


Xm 


*1.1 

Co  Cj 


*1,2 

Co  «! 


Z«l©-1. 


where 
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*1.1  =  Cq^'^Icq.  c,.  •  • ,  c,_,f .  x,_2  =  CS^IO,  c„  •  • .  ei  =  [  1,0,  ■  .of . 

With  this  generator,  the  matrix  A  in  (23)  will  have  the  foim 

■l(x,,,)  O  L(x,^  o' 

O  oj  • 

Mow  note  that  tte  matrix  A  in  (28)  is  in  the  fonn  same  as  (10).  The  proceduie  will  triangular- 
ize  the  matrix  A  into  a  matrix  A  of  the  fonn  (II), 

[l  o  o  o' 

U  L,  Lj  O  • 

Since  the  Schur  complement  of  T  in  A  in  (27)  is  -T**,  fonnula  (25c)  will  ^eld 

T-‘  =  LjLJ  -  L,L,r 


Non  Symmetric  Matrices. 

For  non-symmetric  rectangular  matrices  A  one  can  fonnulate  a  similar  procedure.  We 
form  the  matrices  At  and  A2  as 

A,  ■  [  K(x,,  F,).  •  • .  K(x„.  F,)  ] ,  Aj  *  [  K(y,,  Fj),  •  • ,  K(y„.  Fj)  ] ,  (29) 

and  annihilate  the  elements  in  A|  and  A2  by  post-multiplying  A]  and  A2  with  spinor  matrices, 
instead  of  Givens  and  hyperbolic  rotations.  A  spinor  matrix  is  defined  as  the  identity  except 
for  the  following  four  entries. 


(1  +  KiKj)*^ 

Because  of  the  invariance  property  thiU 


»  * 


-Kt 


(1  +  KjICj) 


Vi  »  = 


^2 


(1  +  tCjICj) 


M  • 


A  =  AfAj^  =  AiS,jS,“y'Af, 

a  similar  procedure  to  that  in  the  symmetric  case  will  give  a  displacement  representation  of  A, , 


A#  ■  At^  -  A2,iAf}Ai^  = 


(30) 


where  L,  and  Vf  are  lower  triangular  Toeplitz  matrices. 
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4.  Some  Generalized  Gohberg-Semencul  Formulas. 

The  first  step  of  the  procedure  to  obtain  a  generator  for  the  matrix 
A,  =  A2^  -  A2,iAi"}Ai_2  is  to  find  a  generator  of  the  matrix 


A  = 


At.i  A|^ 
Aj,i  A2^ 


This  can  be  done  by  using  the  lemma  in  Sec  3.  In  this  section,  we  shall  give  generators  of 
some  interesting  A’s,  from  which  the  correspotKling  Gohberg-Semencul  formulas  for  the 
matrices  of  interest  will  be  evident 


Example  1  (Generator  of  inverse). 

Let  B  be  an  n  X  n  symmetric  positive-definite  matrix,  with  a  known  generator, 
{  W,  WX  },  Z  =  Ip  ©-I^  with  respea  to  {Z,,  Z„ }.  Consider  the  matrix. 


B  I, 

I-  o 

Then  by  using  the  lemma  in  Sec  3,  it  is  easy  to  see  that  {  X,  XZ  ),  where 


(31) 


Xs 


W, 


'fp  Wp+1  •  •  Wp^  e, 


0  ei/2  0 


0  -«,/2 


2*  =  Ip+i  ©-I, 


*?+i 


(32) 


is  a  generator  of  A  with  respect  to  {Z,  ©Z,,  Z„  ©Z,  ).  The  length  of  the  generator  of 
obtained  with  the  above  A  will  be  p+^+2.  However,  if  the  given  generator  (B.  Z, ,  Z, ) 
satisfies  a  condition  called  admissibility  [18]  then,  as  we  shall  see,  we  can  obtain  a  generator  of 
B"*  with  a  length  less  than  p +17 +2. 


Example  2  (Inversion  with  admissible  generators). 

A  generator  for  a  Hermitian  matrix  B  e  R"’*,  {  W,  WZ  },  Z  =  Ip  ©-I,  with  respect  to 
(Z;,,  Z, }  is  called  admissible  if  e|  e  range  (W),  i.e.,  if  there  is  a  linear  combination  of  the 
columns  of  the  generator  that  will  give  the  unit  vector. 
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Let  Z„.  Z„)  be  admissible,  and 

[4!.  1^2.  •  • .  =  e{. 

Then  it  can  be  checked  that  the  matrix. 


B  I, 
In  m. 


p  p+q 

n*Zin.i'-  £  1^/'^ 


has  a  generator  (  X,  XX  ).  with  lespea  to  {Z*  ©Z„,  Z„  ©Z, },  where 


(33) 


Wj  Wp  Wp+l 


H,ei  ■  •  UpC,  -Hp+ie,  •  •  -iip+qCi 


Esip©-!,. 


(34) 


Since  the  Schur  complement  of  B  in  (33)  is  -  B~*,  we  see  that  the  generator  of  B"* 


obtained  with  the  A  in  (33)  will  have  length  p+<?+l  if  ^  0,  orp-hj  if  t)  =  0,  consistent  with 


the  results  hist  obtained  in  [18]. 

For  an  example  of  a  minimal  admissible  generator  (besides  generators  of  Toeplitz 
matrices),  let  us  consider  an  m  x  n  Toeplitz  matrix  T  =  (c,_;)  with  a  full  column  rank-  The 
matrix  T^T  has  a  minimal  generator  [4],  {  W,  WE  ),  E  =  hO-h^  respect  to  {Z,,  Z, ), 
where 


Wj  =  T^ti/lltill2.  W2  =  t2,  W3  =  z,zjwi,  W4  =  Z,lj,  (35a) 

fl~(  l-O*  I'  '  '  •  I  1  •  ^2“!  •  ^l-n  3  •  3l~[  ^m—h  ’  ‘  ‘  3  »  (35b) 

and  1 11 12  denotes  the  Euclidean  norm.  This  generator  of  T^T  is  admissible  since 


[I/lltill2.  0-  -IflltiHz-  03W^  =  el. 


(36) 


Therefore,  the  matrix  A  in  (33)  would  have  the  form 


r  T  I, 
I,  o 


(37) 


and  the  procedure  will  give  a  generator  of  (T^T)"‘  of  length  4.  The  displacement  representa¬ 
tion  of  (T^T)"'  is  useful  in  solving  Toeplitz  least  squares  problems  (m  t  n). 


Example  3  (Generator  of  'T,), 
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Let  T,  e  and  T2  =  (c,_y)  e  be  Toeplitz,  and  symmetric  positive-definite 

Toeplitz,  lespectively.  If  we  define 


tT  O 


(38) 


L  *1  J 

then  A  has  a  generator  {  X,  X£  }.  X  ^  respect  to  {Z„  0Z„,  0Z„ },  where 


V,  U2  V2  U2 
U|  e]/2  U|  -e]/2  ’ 


vi  =  the  first  column  of  T2  devided  by  c^, 

U]  =  the  first  column  of  Tf, 

V2  =  same  as  Vi  with  the  first  entry  equals  to  zero, 

U2  =  the  first  column  of  with  the  first  entry  equals  to  zero. 
With  the  above  A,  we  can  obtain  a  generator  of  T/T£*Ti  with  length  4, 


The  displacement  representation  of  T/Tf'T]  is  useful  in  solving  weighted  Toeplitz  least 
squares  problems  (n  km),  as  arise  in  certain  parametric  time  series  identification  problems. 


Remark  1.  One  can  obtain  a  generator  of  TfTi  by  setting  T2  in  (38)  with  I„.  This  procedure 
will  need  0(mn)  computations,  which  is  of  the  same  order  as  evaluating  the  closed  form 
expression  in  (3Sa). 


Remark  2.  It  is  interesting  to  note  that  oCTi^Tf'Tj,  Z„,  Z„)  ^  4,  whereas 
^  reason  is  that  the  first  matrix  can  be  identified  as  the  Schur 

complement  in  a  2  x  2  block  matrix,  while  to  do  so  for  Ti^T2Ti  requires  going  to  a  3  x  3 
block  matrix  whose  displacement  rank  can  be  6. 

Remark  3.  Once  one  has  a  generator  of  one  can  obtain  a  generator  of  (T/Tf’Ti)"‘ 

using  the  matrix  A  in  (31)  and  the  generator  in  (32).  This  will  give  a  generator  of  length  6. 
However,  it  turns  out  that  minimal  generators  of  'Tj  are  admissible  (with  ri  =  0).  and  the 
displacement  rank  of  (TfXf'Ti)"*  is  less  than  or  equal  to  4  (see  Sec  5). 
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Example  4  (Generator  of  the  projection  operator). 

Let  T  be  an  m  x  n  Toeplitz  matrix  with  full  column  rank.  If  we  define 

X  x^ 

T  l„\' 

then  {  X,  XZ  },  Z  a  IzG-h'  where 


(39) 


Xa 


W,  W2  W3  W4 

Ij/llt|ll2  Cl  t|/llt|ll2  0 


w,-  and  t|  are  as  in  (35). 


(40) 


is  a  generator  of  A  with  re^ct  to  (Z,  Z,  0Z„ ).  By  applying  the  procedure  in  Sec  3 
to  the  above  matrix  A,  we  can  obtain  the  generator  (of  length  4)  of  the  projection  operator 
I„  -  t(T^T)~*T^  on  the  ((/n-/i)-dimensional)  kernel  of  T^.  Also,  in  this  case,  the  matrix  M 
in  (24)  turns  out  to  be  the  orthogonal  basis  of  the  range  of  T,  because  by  (25a,  b) 

T^T  =  LL^,  and  T  *  ML^.  which  implies  M^M  =  I, ,  Me  . 

In  fact,  we  have  the  QR  factorization  of  T  (see  also  [4]). 


Example  5  (Generator  of  the  pseudo-inverse). 

Let  T  be  an  m  x  n  Toeplitz  matrix  with  a  full  column  rank.  If  we  define 

'pT*  'p  »pr 

A  a  I  o  ' 

*11  ^itxm 

then  by  using  the  results  in  (36)  and  (40),  we  can  see  that  A  has  a  generator,  {  X,  Y  },  with 
respect  to  (Z,  ©Z,,  Z,  ©Z„ )  whcie 


Wj  W2  W3  W4 

e,/llt,ll2  0  e,/))t,ll2  0 


Wj  W2  -W3  -W4 

t|/lltjll2  Cj  — tj/lltjll2  0 


and  w/  and  t|  are  as  in  (35).  With  the  above  generator,  we  can  obtain  a  generator  of  the 


pseudo-inverse  of  T  of  length  4. 


5.  Concluding  Remarks. 


We  have  presented  a  constructive  approach  to  the  famous  Gohberg-Semencul  formula  for 
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the  inverse  of  a  Toeplitz  matrix,  and  obtained  various  generalizations  of  it  We  note  that  for¬ 
mulas  of  Gohberg  and  Semencul  type  are  closely  related  to  the  problem  of  finding  displace¬ 
ment  representations  (generators)  of  various  matrices. 

For  further  development,  we  may  mention  that  the  procedure  for  2  x  2  block  matrices 
given  in  Sec  3  can  be  easily  generalized  toN  x  N  block  matrices.  For  instance,  by  considering 
3x3  block  matrices,  one  can  obtain  a  generator  of  [A2^  -  Ai^A  ,“}Ai^"‘.  As  an  example, 
we  can  obtain  a  generator  of  (Ti^Tf  *Ti)“*  by  working  with  the  matrix 


r  Ai.i  Ai_2 

- 1 

H 

< 

O  ■ 

A  a 

> 

O 

1 _ 

•  = 

1 - 

H 

O 

L _ _ 

> 

II 

I 

The  length  of  the  obtained  generator  will  be  4.  As  another  example,  a  generator  of 
can  be  obtained  by  choosing 


Ai.i  = 


Tj  I 
I  O 


O 

Ti 


We  also  remark  that  divide-and-conquer  versions  of  the  procedure  in  Sec  3  can  be  readily 
obtained  [31.  By  using  this  af^roach  to  compute  the  displacement  representation  of  (T^T)~', 
one  can.  for  instance,  obtain  least-squares  solutions  for  Toeplitz  systems  in  0(mlo^m)  opera¬ 
tions. 


There  exists  an  interesting  relationship  between  the  reflection  coefficients  of  a  Toeplitz 
matrix  T  and  those  of  [IS].  A  constructive  proof  of  this  relationship  can  be  found  in  [2]. 

Hnally,  we  should  note  that  several  authors  have  explored  the  problem  of  fast  inversion 
of  various  structured  matrices  by  employing  somewhat  different,  but  related,  definitions  of  dis¬ 
placement  We  may  mention  the  work  of  Heinig  and  Rost  [12],  Gohberg  et  al  [9],  [10],  L. 
Lerer  and  M.  Tismenetsky  [17].  Some  of  the  formulas  therein  are  also  generalizations  of  the 
Gohberg-Semencul  formula.  More  work  needs  to  be  done  to  clarify  the  relationships  between 
these  different  results  and  approaches. 
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APPENDK 


Lemma  Al.  Let  T  be  a  symmetric  Toeplitz  matrix.  Then  the  {k,-  }  defined  by  the  construction 
in  Sec  2  will  be  less  than  one  in  magnitude  if  and  only  i/T  u  positive-definite. 


Proof.  If  Ik,  I  <  1  for  all  1  ^  ^  n-1.  then  we  can  complete  the  tiansfonnation  to  get  the 
matrix  A  in  (11).  and  T  =  LL^.  Hence.  T  is  positive-definite.  Now.  let  us  assume  that  T  is 
positive  definite,  and  Ik^  I  <  1  for  1  ^  y  ^  i-1.  Alter  the  (i-l)st  sweep,  the  upper-half, 
of  the  matrix  A,_i  has  the  form. 


A  O  O  O 

B  C  D  O  • 


A.^,J[A.^,f 


=  T. 


where  A  is  a  nonsingular  lower  triangular  matrix,  and  C,  D  are  lower  triangular  Toeplitz.  Let 


c  and  d  denote  the  diagonal  elements  of  C  and  D,  respectively.  Suppose  that  Ic  I  ^  Idl,  and 
therefore,  Ik,-  I  ^  1.  Then  T  cannot  be  positive-definite,  because 

\^(J)  ^  s^A,^,J[A;e,f  s  =  c2  -  ^  0. 

where 


s^  =  [-b/^A~‘,  ej ],  ef  =  [1, 0,  •  • ,  0],  b?^  *  the  first  row  of  B, 
which  leads  to  a  contradictioa  □ 
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Chapter  4. 

Displacement  Structure  for  Hankel,  Vandermonde  and 
Related  (Derived)  Matrices 


Abstract 

We  introduce  some  generalized  concepts  of  di^lacement  structure  for  stnictured  matrices 
obtained  as  products  and  inverses  of  Toeplitz,  Hankel  and  Vandennonde  matrices.  The  Toe- 
plitz  case  has  already  been  studied  at  some  length,  and  the  corresponding  matrices  have  been 
called  near-ToepliU  or  Toeplitz-like  or  Toeplitz-derived.  In  this  chapter,  we  shall  focus  mainly 
on  Hankel-  and  Vandermonde-like  matrices  and  in  particular  show  how  the  appropriately 
defined  displacement  structure  yields  fast  triangular  and  orthogoral  faaorization  algorithms  for 


such  matrices. 
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1.  Introduction. 


Many  signal  processing  problems  require  solving  large  systems  of  linear  equations,  either 
directly  or  via  (weighted)  least  squares.  The  basic  solution  tools  are  triangular  factorization 
and  QR  factorization.  These  factorizations  require  O(n^)  or  0(mn^)  flops  (floating  point 
operations)  for  an  m  x  n  matrix,  which  can  often  be  excessively  large.  Therefore,  attention 
focuses  on  structured  matrices,  with  an  eye  both  to  computational  reductions  and  to  imfflemen- 
tability  iu  special  purpose  (parallel)  hardware.  Structured  matrices  arise  in  various  problems  in 
coding  theory,  interpolation,  control,  signal  processing  and  system  theory. 

Very  common  examples  of  structured  matrices  in  the  above  areas  are  Toeplitz,  Hankel 
and  Vandermonde  matrices: 


r=(t.-;)  = 


^0  ^-l  • 
^1  ^0  ■ 

^11-2  ’ 


e  R' 


H  =  (A.>y-2)  = 


/to  hi 
hi  hi 

K-\  h„ 


hiK-i 


e  R" 


V  m  V(cJC)  = 


—  - 

—  t^K  — 


K  =  diag(  *2.  •  • .  k^),  ki*0,  c  =  [1,  1,  •  • .  if. 


G  R" 


(1) 


(2) 


(3) 


These  matrices  have  certain  shift  invariant  properties:  The  (infinite)  Toeplitz  matrices  are  diag¬ 
onally  shift-invariant.  Hankel  matrices  are  "reverse  diagonally"  shift-invariant,  and  Vander¬ 
monde  matrices  are  vertically  shift-invariant  apart  from  K. 
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However,  it  is  easy  to  see  that,  for  example,  Toeplitz  stnicture  is  not  invariant  under  vari¬ 
ous  frequently  occurring  operations  such  as  multiplication,  inversion,  triangular  and  orthogonal 
factorization.  What  is  often  invariant  is  a  suitably  defined  notion  called  the  displacement  rank. 
For  example,  let 

VA  a  A  -Z^AZj,  A  e  R"**,  (4) 

where  Z,  is  the  n  x  /i  lower-shift  matrix  with  ones  on  the  first  subdiagonal  and  zeros  every¬ 
where  else,  and  define  the  displacement  rank  of  A  by  rank(VA).  Then  it  can  be  shown  that  a 
Toeplitz  matrix  T  and  its  inverse  T~^  have  the  same  displacement  rank. 

rankiVT)  =  rankiVT^)  =  2.  (5a) 

Moreover  if  7  is  Heimitian.  then 

inertia  (VT)  -  /nertia(Vr“').  (5b) 

Displacement  structure  as  just  defined  has  by  now  been  studied  in  some  detail  (see  [12] 
for  a  recent  survey,  and  also  [13],  [17]).  with  many  results  for  closely  related  definitions  in  [8] 
and  [10]  among  others. 

In  this  chapter,  we  shall  briefly  study  an  extended  form  of  (4), 

aA  (6) 

and  especially  another  definition 

aF^'A  -AF*^,  (7) 

where  the  matrices  (F^,  F^ )  can  be  fairly  general  subject  to  certain  restrictions  described  in 

Sec  2. 

For  reasons  that  will  soon  appear  we  shall  call  the  matrices  snd  A^/ jr»yA  the 

Toeplitz  displacement,  and  the  Hankel  ^placement  of  A  with  respect  to  displacement  itera¬ 
tors  (F^.  F^}.  The  rank  of  will  be  called  the  Toeplitz  displacement  rank  of  A 

(with  respect  to  (F^,  F*J),  and  denoted  as  The  rank  of  be  called 

the  Hankel  displacement  rank  of  A  and  denoted  as  Pgr/ • 
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We  shall  say.  roughly,  that  a  matrix  is  "stnictuied"  (e.g.,  "close-to-Toeplitz"  or  "close-to- 
Hankel"),  it  the  matrix  has  a  dis;dacement  rank  (with  respect  to  some  {F^ ,  F^}),  indqjendent 
of  the  size  of  matrix.  The  interesting  fact,  which  enables  fast  algorithms  for  triangular  and 
orthogonal  matrix  factorization  and  matrix  inversion,  is  that  such  structure  is  inherited  under 
inversion,  multiplication  and  Schur  ctnnplementation.  This  chapter  will  demonstrate  this  fact 
for  various  types  of  structured  matrices. 

First  in  Sec  2,  we  establish  the  claims  just  made  about  inversion,  etc.  Based  on  these 
results,  we  present  a  general  factorization  algorithm  in  Sec  3,  which  will  be  further  specialized 
in  later  sections.  Thus  in  Sec  4.  we  shall  show  how  to  compute  the  triangular  factorization  of 
Vandermonde  matrices.  Triangular  factorizations  of  close-to-Hankel  matrices  will  be  presented 
in  Sec  5,  and  QR  factorizations  of  Vandermonde  matrices  in  Sec  6.  References  to  prior  work 
on  each  of  these  applications  can  be  found  in  the  corresponding  sections.  This  chapter  comple¬ 
ments  our  recent  work  on  Toeplitz  and  close-to-Toeplitz  matrices  (see  Chapter  2  or  [S]).  On  a 
first  reading,  readers  can  skip  all  "Remarks"  in  this  chapter  without  loss  of  continuity. 

2.  Some  General  Properties  of  Displacement  Operators. 

In  this  section,  we  shall  examine  useful  choices  for  the  displacement  operators  [F^ ,  ) 
in  the  general  definitions  (6)-(7)>  and  derive  some  results  on  Schur  complements  that  will  allow 
us  to  easily  study  the  displacement  structure  of  matrix  products  and  inverses. 

One  of  the  criteria  for  choosing  displacement  operators  is  to  make  the  corresponding  dis- 
placeroem  ranks  of  M  as  small  as  possible,  because  as  will  be  seen  presently  the  displacement 
rank  determines  the  complexity  of  various  operations  on  A.  The  special  structures  in  the 
matrices  (l)-(3).  and  the  results  (5)  naturally  suggest  tiie  shift  matrix  Z,  as  an  important  candi¬ 
date.  In  fact,  for  a  Toeplitz  matrix  T  e  R””",  a  Hankel  matrix  H  e  and  a  Vandermonde 
matrix  V  »  V(c,iir)  e  R”’*”,  we  can  readily  see  that 
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^0 


has  rank  2. 


(8a) 


has  rank  2, 


(8b) 


^(Z.X 


-.)V'=Z,V-VA-*  = 


has  rank  1. 


(8c) 


But  how  about  non-Toeplitz,  non-Hankel  or  non-Vandennonde  matrices,  but  ones  that  arc 
known  to  be  the  inverses  of  some  Toeplitz,  Hankel  or  Vandermonde  matrices,  rcspectively? 
The  following  lemma  indicates  that  these  matrices  also  have  displacement  structure,  a  fact  first 
noted  in  [13]  for  Toeplitz  matrices. 


Lemma  2.1  Displacement  rank  of  Inverses.  For  any  nonsingular  matrix  A , 

Proof. 


=  rankliA-F^  AF'’^)A-^]  =  rankll-Ff  AF'^^A-^ 

V**’  KA-'-F^^A-^F^  )A  ]  =  rank[I-F‘'^A-^F^  A  ]. 

But  the  nonzero  eigenvalues  of  {FfAW'^^A'^)  and  arc  identical  Therefore. 

«(/r/  =  0l(f*r Next,  notc  that 

=  rank[{Ff  A-AF^'^)A-^]  =  rank[{Ff -AF'^'^  A'^)], 

P(FM-j,/Ty4-'  =  rank[AiF'*'^A~^-A-^Ffy\  =  rank  [/If 
and  therefore.  P(p,^»y4  =  P(p»r^/r^"'.  □ 
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In  paiticular,  we  see  immediately  from  Lemma  2.1  and  (8)  that 

rank  (r"*  -  zIt%  ]  =  rani  [r  -  Z,  TZl]  =  2,  (9a) 

rank -  H-% ]^ranklZ„H  -  HZl]  =  2,  (9b) 

rank  [«■“'  V*  -  V'Z,  ]  =  rank {Z,  V  -  VAT*]  =  1 .  (9c) 

Similar  results  can  be  obtaii^  for  matrix  products.  However,  it  will  be  useful  to  first 
consider  the  displacement  properties  of  the  so-called  matrix  Schur  complements.  The  forma¬ 
tion  of  Schur  complements  is  the  heart  of  triangularization  procedures  for  matrices.  Moreover 
we  shall  see  that  working  with  Schur  complements  of  appropriately  defined  block  matrices 
leads  immediately  to  results  on  the  displacement  structure  of  matrix  products  (and  inverses). 
The  following  lemma  generalizes  a  result  first  given  in  [19]. 


Lemrtu  Displacement  rank  of  Schur  complements.  Let  S  e  be  the  Schur 

complement  of  A  €  inM  e  R'"”',i.e.,  let 


M 


A  B 
C  D 


O  O 
O  5" 


e  R"”*,  S  -CA~'B,  :  nonsingular. 


IfF^  e  R""*"  and  e  R"^  are  block  lower  triangular  matrices,  i.e., 


F^  = 


F{  O 

H  Fi 


F^  = 


Ft  O 
Ft  Ft 


F{  e  R' 


Ft  6  R' 


(10) 


then 


Proof.  It  is  not  hard  to  check  that  has  the  form 


w-'  = 


*  * 

*  r‘ 


Therefore, 


(lla) 

(llb) 


(12a) 
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-  P(f*r^/ryV/  *  S  P(Ff ~  ~  P(F^/’*)'^’  (12b) 

where  the  hist  and  third  equalities  in  (12a}  and  (12b)  follow  from  Lemma  2.1.  The  secmd 
inequalities  follow  from  the  block  lower  triangularity  of  and  F'*,  and  from  the  fact  that  a 
submatrix  has  smaller  rank  than  the  matrix.  □ 

Remark  2.1.  Without  the  triangularity  assumpdmi  on  F^  and  the  Schur  complements 
may  have  larger  displacement  rank  than  die  matrix  itself.  See  Remark  2.4  below. 


Applications. 

Judicious  use  of  Schur  complements  will  allow  us  to  easily  derive  the  displacement  pro¬ 
perties  of  matrix  inverses  and  matrix  products.  For  example,  we  could  alternatively  have 
obtained  the  results  in  (9)  as  follows.  Consider  the  (extended)  matrices 


T  I 

H  / 

V  / 

Af  1  m 

/  0 

.  Ml » 

/  0 

,  Afs  m 

/  0 

and  dehne  the  displacement  operators 


(I3a) 


Zl  O 
O  Zl  ' 


z,  o 
O  zl 

*•  J 


Then  by  using  Lemma  2.2  one  can  check  that 


Z,  O 
O  K~^  ' 


K-^ 

O  Zl 


(13b) 


®(z:.2j)^‘  =  1  =  2. 

(13c) 

P(ZJ’7.V^~*  ==  P(F*F,)W  2  =  2, 

(13d) 

P(«r-‘/»)''  ‘  =  P(FiF|)^3  =  1- 

(13e) 

Lemma  2.2  is  also  useful  in  determining  the  displacement  ranks  of  products  of  matrices. 
For  examine,  let 


Af, 


/  T 

r*"  o 


,  Fm 


Z,  O 

O  z. 


Then  -T^T  is  the  Schur  com(dement  of  /  in  Af  i.  and  therefore. 


(14a) 
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Similarly,  let 


I  H 

O 

tti 

0 

.  F* 

0 

Then,  we  can  see  that  has  low  Hankel  displacement  with  respect  to  {Zj,  zj}  (instead  of 
{Z,,Z,})  because 


P(f /•)W2  =  4  = 

Finally,  for  the  produa  of  Vandennonde  matrices,  consider  the  matrix 


(14b) 


A/,s 


I  V' 
V  o 


r-X 


o 
o  Z. 


Then  we  can  see  that  has  rank-2  Hankel  displacement 


This  is  not  surprising  because  is,  in  fact,  a  Hankel  matrix.  Also  some  experiments  will 
show  that  V^V  docs  not  seem  to  have  a  low-disjdacement  rank  with  respea  to  "simple"  dis¬ 
placement  operators. 


Remark  12.  Referring  to  the  result  in  (14b),  we  may  note  that  not  ,  but  rather  , 
where  7  is  the  reverse-identity  (ones  on  the  anddiagonal)  matrix  has  low  Hankel  displacement 
rank  with  respea  to  the  usual  Hankel  displacement  operators  (Z,,  Z„}.  To  see  this  consider 
the  following  matrix  and  displacement  operator. 


■  7  H 

Zn  0 

Afj" 

0 

.  F« 

0 

Then 


(15) 


P(F/')^2  =  4  = 


Remark  12.  By  considering  3  x  3  or  higher  order  Mock  matrices,  one  can  daermine  the  dis¬ 
placement  rank  of  other  composite  matrices.  For  instance,  the  matrix 
where  Hi  is  a  Hankel  matrix,  has  Hankel  displacement  rank  4  with  respea  to  (F,  F }.  where 
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z„  o  o 

Fa  O  Z,  O 
O  O  Z, 

Remark  2.4.  One  might  have  noticed  that  the  Hankel  matrix  H  also  has  a  low  ToepUtz  dis¬ 
placement  rank,  viz.  that 

The  only  difficulty  of  using  this  displacement  in  the  context  of  the  fast  algorithm  in  Sec  3.  is 
that  the  displacement  operator  is  not  a  block  lower  triangular  matrix.  Therefore,  the  Schur 
complements  of  H  can  have  larger  displacement  ranks  than  that  of  H.  With  a  sUght 
modification  of  Lemma  2.2,  one  can  show  that  all  Schur  complements  of  a  Hankel  matrix  with 
respea  to  this  displacement  operator  have  rank  3. 

Remark  2J.  A  displacement  of  a  matrix  A  can  characterize  the  matrix  A  if  we  can  solve  the 
equations  (6)  and  (7)  for  A  uniquely  (the  equations  (6)  and  (7)  are  necessarily  consistent  in  our 
context).  It  is  well  known  that  the  (consistent)  equation  (6)  uniquely  determines  the  solution 
A,  regardless  of  what  the  displacement  operators  are.  However,  the  (consistent) 

equation  (7)  has  non-unique  solutions  (see,  e.g.,  [11],  [IS])  if  (and  only  if)  there  exists  a  pair 
of  eigenvalues  X,(F^ )  and  Xy(F‘)  such  that 

XiiF^)-Xj(F^)  =  0.  (16) 

This  condition  holds  for  most  of  the  displacement  qrerators  for  cIose-to-Hankel  matrices  that 

we  are  interested  in.  which  is  unfortunate  because  we  are  concerned  how  to  find  L  and  U  such 

that  A  =LU  (in  a  fast  way),  given  only  the  displacement  of  A .  We  shall  circumvent  this 

non-uniqueness  problem  by  imposing  an  additional  constraint  (see  Sec  3). 

3.  Fast  Partial  Triangular  Factorization  using  Hankd  Generators. 

In  the  rest  of  this  chapter,  we  shall  only  consider  Hankel  displacement  structure. 
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Corresponding  results  on  Toeplitz  displaamcnt  can  be  found  in  [5].  Also  it  is  important  to 
mention  at  this  point  that  we  consider  only  strongly  nonsingular  matrices  for  triangular  factori¬ 
zation,  mdfiUl  column  rank  matrices  for  QR  factorizatioa 


General  Schur  Reduction. 

We  note  (see,  e.g.,  [9],  also  (5],  [17],  [18],  [23],  [24])  that  the  standard  triangular  factori¬ 
zation  procedure  can  be  regarded  as  arising  from  the  recursive  computation  of  the  Sdiur  com¬ 
plements  [Si }  of  the  leadmg  principal  submatiices  of  a  given  matrix.  Let 
Ao  a  So  ■  A  e  .  and  define  1.  e  R"’“.  u.  e  R"^‘,  d.  0  and  the  ith  reduced  matrix  A. 
recursively  by 


Ai_i  = 


dll — - ]+A,,  A, 


Si 


(Ha) 


Given  Aj_i,  we  can  determine  the  quaruities  in  (17a)  as 


u.  =A?lie,,  =  1/[I.],  =  l/[u,]i,  (17b) 

where  e,  is  the  unit  vector  with  one  at  the  ith  position  and  zero’s  elsewhere,  and  [v],  denotes 

the  ith  element  of  the  vector  v.  The  computation  of  the  reduced  matrix  A,  from  A,_i  will  be 

called  (one  step)  Schur  reduction  [9],  [17].  Using  r  Schur  reduction  steps,  we  can  obtain  the 

(r-step)  partial  triangular  factorization. 


A  =  il,d,u/'  +  A,. 
1-1 


PNl 

* 

dx 

oo' 

II  1 

1 _ 

\ 

dr 

»  . 

[V.- 1 

+ 

O 

■  fZ)U^+A,. 


(18) 
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The  ”ti'^>ezoid"  matrices  L.  U  and  the  diagonal  matrix  D  will  be  called  (r-step)  partial  tri¬ 
angular  factors  of  A .  Notice  that  r  Schur  reduction  stq>s  take  O  (r/t^)  flops. 

Schur  Reduction  using  Hankel  Displacements. 

The  displacement  rank  of  a  matrix  can  be  much  smaUer  than  the  rank  of  the  matrix  A 
itself.  Also  if  the  chosen  displacement  operators  for  a  given  matrix  A  c  R”’” 

satisfy  the  condition  in  Lemma  2.2  for  aUl^i  ^  r ,  viz,  that 

the  r  X  r  leading  principal  submatrices  of  F-^  and  F^  are  lower  triangular,  (19a) 
or  pictorially, 

(19b) 


then  the  displacement  ranks  of  the  reduced  matrices  (A,-:  0  ^  ^  r)  do  not  increase: 

P(f/ ^  P(F/ 1  i  i  ^  r.  (20) 

Note  that  is  determined  only  by  0(pn)  parameters,  whereas  the  matrix  A,  itself 

needs  0(n^)  parameters.  Therefore,  we  can  hope  that  the  Schur  reduction  procedure  in  (17) 

can  be  done  more  efficiently  with  0(r^)  flops  (instead  of  O(rn^)  flops)  if  we  successively 

compute  rather  than  A,-.  This  is  indeed  the  case  as  we  shall  see  shortly.  For  the 

rest  of  this  chapter,  we  shall  restrict  ourselves  only  to  di^lacement  operators  that  have  the 

form  (19),  because  (20)  is  a  nice  property  to  have. 

Recalling  the  definition  of  displacement, 

“  A,_iF*’^,  and  F*  have  the  form  (19), 

we  can  check  that 

”  (F>'A,-,-A._,F»^)ei  =  (F/-{F*].^/)li.  (21a) 

-  (-F*AU-^AUFf^yti  =  -<fMf^].,/)u..  aib) 

where  [F\j  denotes  die  {l,JyO^  elemertt  of  F.  Therefore,  I,-  and  u,-  can  be  obtained  Iqr 
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where  the  superscript  #  denotes  an  appropriate  generalized  inverse.  After  determining  I,-,  u,- 
and  di,  we  can  obtain  the  displacement  of  the  reduced  matrix  A,-  as 

<23a) 

=  F/A._,  -  -  F/|.d.u?'  +  (23b) 

=  V'F*/'-!  -  +  I.riiU?F*^.  (23c) 

If  (F^-[F*],_,/)  and  (F*-[F^1,^/)  are  singular,  then  there  are  many  I,  ’s  and  u,’s  that 
would  satisfy  (21).  Let  us  consider  separately  the  nonsingular  and  singular  cases. 

A.  Nonsinguiar  cases. 

If  (F'^-[F*],  ;/)  and  (F*^F■^],•^•/)  are  nonsingular,  i.e.,  if  [F*],-  ,-  and  [F-^],,,-  are  not 
eigenvalues  of  F^  and  F*,  respectively,  then  1,-,  u,-  and  </,•  arMl  therefore  A,-  will  be  determined 
uniquely  by  taking  the  ordinary  inverse  in  (22). 

Example  3.1.  Let  F^  =  Z,  and  F^  =  if"*.  These  displacement  iterators  are  useful  for  the 
Vandermonde  matrix  V  =  V'CcJf )  €  R'”®'  because  they  give  the  smallest  displacement  rank. 
Note  that  (Z,^if  "*],_,/)  and  (/("'-[Z,  ],_,/)  =  if"*  are  nonsingular  for  \  ^i  ^n. 


B.  Singular  cases. 

If  (F^^F*]j^/)  and/or  (F*-{F^l,j/)  are  singular  then  the  Schur  reduction  using  (22) 
and  (23)  is  ambiguous.  Note  that  we  are  not  completely  hee  in  choosing  F^  and  F^,  because 
the  structure  of  a  given  matrix  dictates  appropriate  F^  aixl  F^  that  give  the  smaUest  displace¬ 
ment  rank.  Instead,  we  shall  overcome  this  difficulty  by  using  the  following  two  approaches. 
The  key  observation  behind  these  approadies  is  the  fact  that  only  the  projection  of  I,  and  u, 
lying  in  kemel(,F^ -{F^]i^l)  and  kerneliF^-{F^]i^l)  are  not  uniquely  determined. 

If  we  have  additiorud  information  about  such  ambiguous  components  of  I,-  arvl  u,-,  then 
we  can  determine  Ij  and  Uj  correctly.  We  shall  use  this  r^tproach  for  the  triangularizatkm  of 
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the  inverse  of  Vandermonde  matrices  in  Sec  4. 

The  other  approach  is  to  extend  the  matrix  A  e  to  a  larger  one.  say  i4  c 
such  that  >4  is  a  leading  principal  submatrix  of  A.  Let  L  e  R”*^  and  U  e  R"^  be  /i-step 
partial  triangular  factors  of  A.  Di^lacemoit  operators  and  for  A  are  chosen  such  that 
the  following  is  tnie, 

L  l.kernel{P-{P]i^l)  and  iF X^kemel{F^ -{pf  ]^I),  (24) 

where  AA.B  denotes  A^B  =  O.  Now,  we  can  perform  n-step  partial  tiiangularization  of  A 

using  (22)  and  (23)  unambiguously,  because  we  can  compute  1)  (the  ith  column  of  L)  and  n,- 

(the  f  th  column  of  U^)  by  taking  the  Moore-Penrose  inverse  (pseudo-inverse)  in  (22).  After 

finding  the  n-step  partial  triangular  factors  of  A  e  R'"’^,  we  can  obtain  the  triangular  fac~ 

tors  L  and  U  of  A  simply  by  deleting  the  m  -  n  rows  of  C  and  U. 


Example  3J.  Let  H  g  R**^  be  a  Hankel  matrix  in  (2).  For  Hankel  matrices,  the  (desirable) 
displacement  operators  are  (2^,2^).  Note  that  (Z,-{Z;, ],•,,•/)  is  singular.  However,  if 
we  define 


H  0 
0^  0 


G  r(''+>M«+»,  (25a) 

then  w  still  small  (in  fact,  4),  and  partial  triangular  factors  of  Hi  and  the  dis- 

placement  iterators  [Z^+i,  satitfy  (24). 


Example  3J.  Let  H  g  R"’*"  be  a  Hankel  matrix  in  (2).  Define 


G  R 


2iix2ii 


(25b) 


H  Uk 
Vg  O 

where  Ug  is  the  "reverse  upper  trianguloF'  Hankel  matrix  (with  zero  elements  in  die  lower- 


right  corner)  such  that  Ui  is  Hankel.  As  an  example,  for  a  3  x  3  Hankel  matrix,  Ug  has  die 


form. 
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ho 

fix  fl2 

As  A4  0 

H  ~ 

fix 

fl2  As 

A4  0 

fiz 

As  fl4 

0 

Now  the  partial  triangular  factors  of  Hi  and  the  displacement  operators  {Zj,.  )  sati^ 

(24). 


Generators  of  Matrices. 

For  a  given  matrix  A  e  R””^.  ai^  matrix  pair,  (JIT,  Y )  such  that 

=XY^.  X  a[x,.X2....Xp]e  ^  ^  [  y„  yj.  .  .  .  yp  ]  e  R"*^ 

is  called  a  (Hankel)  generator  of  A  with  respect  to  [F^ ,  )  .  The  numbers  P  are  called  the 

length  of  the  generator  (with  respect  to  ,  F^)).  A  generator  with  respect  to  [F^,  F*) 

with  its  length  equal  to  the  displacement  rank  is  called  a  minimal  generator  (with  respect  to 

(f/,f‘}). 


Example  (continued).  Generator  of //{.  The  matrix  Hx  in  (25a)  has  the  displacement. 


0 

Ao 

K-i 


K-x 


K 


'  ^2m-2 


-A«-i 

~fln 

“A2ji-2 

0 


and  ther^ore  has  a  generator.  {X],  Yi),  where 


10  0  0 

0  0  0  1 

0  •  -A,  Aq 

-ho  A,  •  0 

•  0  -Aa,_2 

.  1^1  = 

fl2ii-2  0  ■ 

(26a) 

0  1  0  A,_, 

-A,-,  0  10 

Example  3  J  (continued).  Generator  of  ffz.  The  matrix  Hi  At  (25b)  has  the  placement. 
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and  thertfore  has  a  generator,  {Jf2,  where 


1  0  •  •  0 

T 

0  -ho  • 

X2  = 

0  Ao  •  •  ^2,-2 

K> 

II 

10- 

0 

Fast  Schur  Reduction  using  Hankel  Generators. 
Now,  let 


(26b) 


Notice  that  the  matrix  products  involving  in  (22)  can  be  done  more  efficiently  as 

e;  =  (27a) 

where  matrix-vector  products  are  petfonned  in  the  sequence  as  shown  with  the  square-brackets. 

Fuitheimore,  a  generator  of  ri,-  can  be  obtained  as 

^(.)  ^  -Pf  \idi,  \idi).  =  [1'^'''^  Uf.  F*u,].  (27b) 

because  of  (23c).  Although  the  generator  given  in  (27b)  is  not  minimal,  it  is  possible  to  delete 

the  two  redundant  columns  in  and  in  (27b)  in  an  efficient  way  [16]. 

However,  the  above  Schur  reduction  procedure  is  still  not  efficient,  because  of  the  matrix 
inversions  required  in  (22),  and  the  matrix-vector  multiplicatitHis  F^\i  and  F^u,-  in  (27b). 
Nevertheless,  for  structured  matrices  A  (e.g.,  Hankel,  block-Hankd,  Hankel-block,  Vander- 
moide  etc),  displacement  operators,  F^  and  F^  are  extremely  simple  so  that  such  operations 
are  trivial. 


4.  Fast  Triangular  Factorixation  of  Vandermonde  Matrices. 


The  problem  of  finding  the  coefficients  of  the  nth  (kgree  interpolating  polynomial  can  be 
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formulated  as  a  problem  of  solving  an  (n+1)  x  (n+1)  Vandennonde  matrix  equation.  Bjoick 
and  Pereyra  first  noted  [2]  that  the  divided-difference  scheme  (which  needs  flops)  for 

finding  Newton’s  form  of  the  interpolating  polynomials,  in  fact,  solves  the  Vandermonde  matrix 
equations.  They  also  presented  an  algorithm  that  needs  Oin^)  flops  for  the  factorization 
=  UL .  along  with  other  extensions.  Recently.  Gohbei^g,  Kailath  and  Koltracht  [8]  obtained 
the  algorithm  of  Bjorck  and  Pereyra  by  a  different  route  and  gave  different  extensions.  In  this 
section,  we  shall  present  two  fast  algorithms  for  computing  the  factorizations  V  =  LU  as  well 
as  V~^  =  UL.  We  believe  that  our  approach  is  more  fundamental  and  provides  richer  insight 

Consider  the  Vandermonde  matrix 


where 


V  =  VicJC)  = 


E  R 


jixn 


(28a) 


K  =  diag(  kt,--,  *,),  =  II.  1.  •  • .  1].  (28b) 

Note  that 

■  e,  €  R-**.  yW>  »  e  R"*'.  (29) 

Therefore,  V  has  a  generator  {x^®\  y^^}  of  length  1,  Now  the  Schur  reduction  steps  special¬ 
ized  for  Vandermonde  matrices  can  be  summarized  in  the  following  theorem. 


Theorem  4.1.  Let  Then 


I.  (30a) 

u,  =-Ar[x^*"*^,y^*"0,  (3()b) 

di  =  I/O.],-.  (30c) 

and  a  x^'V'  ^*’,  where 

jfi) «  [y('-«)],^,x(‘-»  -  dilUiUM  +  (4f[ur].-*,/*,>,)Ii]/Ix<*>j..^,.  (31a) 

y(i) ,  (3Ib) 


Proof.  (30)  follows  immediately  from  (22).  From  (27b),  we  have 

X(0y(0T  ^  _  2,l,d,-u/‘  +  (32) 

The  matrix  which  is  the  displacement  of  the  ith  reduced  matrix,  has  the  null  rows 

and  columns  from  1  to  i.  Because  it  is  a  rank-one  matrix,  a  minimal  generator  of 

can  be  obtained  simply  by  taking  the  (i-t-l)st  row  and  the  (i+l)st  column  with  an  appropriate 

normalization; 

x(‘)  =  y^‘)  =  (33) 

The  generator  {x^‘\  y^‘^)  in  (31)  follows  from  (33),  after  inserting  (32)  for  □ 

Now  we  shall  summarize  the  algorithm. 

Algorithm  4.1.  Fast  Triangular  Factorization  of  a  Vandermonde  Matrix 
Input:  A  generator  y^°^}  in  (29)  of  V  »  Vq; 

Output:  Triangular  factorization.  V  -LU  \ 
for  i  :s  1  to  /I  do  begin 

Compute  I/,  u,,  d,  using  (30);  /*  0(n)  flops  (sec  Remark  4.1  below)  */ 

Obtain  a  generator  of  V,-  using  (31);  /*  0(n)  flops  */ 

end 

:=  [li.  I2.  •  • .  ;=[u,,  112.  •  •  ,u„);  O  ?=  diag(d,, 

return  ({£,,  U,D\)\ 

Remark  4.1.  The  matrix-vector  multiplication,  p  ■  in  (30a)  needs  only  0{n) 

flops,  because  it  is  essentially  the  badc-substitution  procedure  solving  the  bi-diagonal  system, 
(/-k.Z,)p  =  x<‘-». 

Remark  4.2.  The  above  algorithm  can  be  applied  to  any  matrix  V  of  the  form  (28a)  with  an 
arbitrary  lower  triangular  matrix  AT  and  any  vector  c  (rather  than  the  c  in  (28b)).  However,  for 
such  cases  the  algorithm  may  take  greater  than  0{n^)  flops. 
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V  =  L,DUi.  I  =  UjDUx.  I  =  LxDL2.  (36b) 

Therefore,  we  can  obtain  the  factorization  V'”’  =  U^DL^  by  the  n-step  partial-triangularizalion 

of  the  matrix  M  using  the  fast  Schur  reduction.  To  do  so.  we  first  need  to  find  a  generator  of 

M  with  respect  to  appropriate  displatrement  operators.  Our  choices  of  displacement  operators 

are 


K~^  O 

o  cl  ’ 

where  C,  is  the  circular  shift-down  matrix,  i.e., 

0  1 
1  0 

Cn  =  e  R-’^.  (37b) 

1  0 

Note  that  where 

x^°^  =  eie  R^*',  -ej]^  e  R^**,  e,  e  R"**'.  (38) 

Note  that  and  in  (37a)  have  the  fonn  in  (19).  Also  (F*^F^ ],^/)  is  nonsingular 

for  1  ^  ^  n,  and  therefore,  u,-  can  be  obtained  by  taking  die  ordinaiy  inverse  in  (22).  How¬ 
ever,  (F^ -(F*l,^/)  is  singular  for  1  ^  i  ^  r.  and 
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kernel(F^  -  [F*],  ,/)  =  ipanj  ^  |,  ze  R"***,  e,-  e  R"’“,  1  ^  j  Sn,  (39) 
where  z  is  the  null-vector,  i.e.,  [zl,-  =  0  for  all  t.  Therefore,  if  we  take  the  pseudo-inverse  in 
(22)  for  I,-,  then  only  the  element  is  not  determined.  However,  note  that  this  element 
[1;  can  be  determined  by  other  means.  Namely, 


[1.],^  =  l/«[u,].), 
because  U2  =  (DU j)"'  from  (36b). 


Remark  4J.  The  reason  of  using  the  F*  in  (37a)  rather  than  F*  in  (13b)  is  to  make 
F*’-[F^]i ;/  be  nonsingular.  We  carmot,  however,  use  C„  in  the  place  of  Z„  for  ,  because 
the  resulting  F^  would  not  have  the  form  (19). 

Now  we  shall  summarize  the  n  -step  partial  triangularization  with  the  following  theorem. 
The  proof  is  similar  to  that  for  Theorem  4.1,  and  we  shall  omit  it 


Theorem  4.2.  Let  where  Mq  »  M  and  F^,  F*  are  as  in  (34)  and 

(37a).  respectively.  Then 


u,  =  -(F^-[Ff  ,  /)-*y^'-’ e.- ,  (40a) 

I.  =  (F/-[F‘]„/)V‘-'y'-‘)^e.  +  e,^/((/,[u,].).  (40b) 

di  =  1/[I,],,  (40c) 

where  A*  denotes  the  pseudo-inverse  of  A .  Also,  where 

x(.) ,  ^  [y<‘-')i.^,xf-»  -  d,[u,l.>,F^I.-  +  (di[u,],>,/k,>,)l,]/[x(‘>l,>„  (41a) 

y(.)  =  [x(‘-')],.^,y<‘-‘)  -  djl.  l^u,  +  d.[li]i^,F^Ui.  (41b) 

Note  that  the  computations  of  (40)  take  only  0(n)  flops  because 


(F^-[F%ir^ 


K  O 

O  c,  • 

b  a 

-*.(/-*, ZJ-‘ 

o 


o 

1C<‘)  • 


K^‘'^mK  except  [AT 


0. 
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We  shall  summarize  the  algorithm. 

Algorithm  4J.  Fast  Triangular  Factorization  of  V~^. 

Input:  A  generator  in  (38)  of  W  a  Mq  e  R^’^; 

Output:  Triangular  factorization,  V“'  =  UL; 
for  i  :=  1  to  n  do  begin 

Compute  I,  ,  u,  .  ri,  using  (40);  /*  0(n)  flops  */ 

Obtain  a  generator  of  V,-  using  (41);  /*  0(n)  flops  */ 

end 

L  :=  [Ii.  I2.  ■  •  .  1«  ]:  :=  [ui.  U2,  •  - .  u„];  D  :=  diag(  ri,.  ^2.  •  • .  ): 

return  (D  and  the  bottom  halfs  of  L,  U^y, 

S.  Fast  Triangular  Factorization  of  close-to-Hankel  Matrices. 

In  this  section,  we  shall  only  consider  strictly  lower  triangular  displacement  operators 
and  F^,  i.e.,  those  with  zeros  on  the  main  diagonal.  It  will  be  seen  that  the  use  of  such  dis¬ 
placement  operators  greatly  simplifies  finding  minimal  generators  of  Schur  complements.  Such 
displacement  operators  can  be  used  for  Hankel,  Hankel  block  and  block  Hankel  matrices. 

Beriekamp  [1],  [19]  (see  also  [3],  [7])  was  perhaps  the  first  to  describe  a  fast  O(n^)  algo¬ 
rithm  (needs  inner-product  computations)  for  solving  Hankel  matrix  equations;  the  closely 
related  Berlekamp-Massey  algorithm  [19]  is  an  algorithm  of  niillips  [21].  Rissanen  [22] 
extended  the  results  of  Phillips  to  block-Hankel  matrices;  The  Berlekamp-Massey  algorithm 
involves  certain  inner-product  computations,  which  is  a  bottle-neck  for  parallel  evaluation. 
Recently,  following  eariier  work  of  Kung  [14]  and  Citron  [7],  Lev-Ari  and  Kailath  [18] 
presented  another  fast  algorithm  that  does  not  need  inner-product  computation.  The  results  in 
this  section  can  be  regarded  as  an  extensitm  of  the  results  of  Lev-Ari  and  Kailath  [18]  to 
Hankel-block  and  block-Hankd  matrices.  Funhennore,  we  shall  give  a  fto.  algorithm  for  com- 
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puting  the  triangular  factorization  of  the  inverse  of  Hatdtel  matrices. 

Let  (X.  K }  be  a  Hankel  generator  of  a  matrix  with  respect  to  strictly  lower  triangular  dis¬ 
placement  operators  {F^ ,  F* }  that  also  sati^  the  condition  (24).  (Otherwise,  we  assume  that 
the  matrix  has  been  extended  appropriately  such  that  (24)  holds.)  We  say  that  a  Hankel  genera¬ 
tor  is  proper  if,  for  a  certain  i,  all  the  elements  in  the  ith  tow  of  X  and  above,  except  for  the 
element  [X],-,i,  are  zero,  and  all  elements  in  the  ith  row  of  Y  and  above,  except  the  element 
are  zero.  Thus  a  proper  generator  has  the  fonn 


r 

1 

o 

o 

0 -  0  * 

X  =  [Xi,  ■  •  ,  Xp]  = 

4 

l«  *  -  ♦ 

1  1 

II 

II 

Ik, 

*  — 

*  — 

*  — 

t  1 

N  ♦  -  ♦ 

*  - 

*  - 

*  - 

Before  we  show  how  to  convert  a  non-proper  generator  to  a  proper  one,  we  shall  summarize 
the  one-step  Schur  reduction  with  the  following  theorem.  Often  we  shall  denote  a  proper  gen¬ 
erator  as  [Xp ,  Yp }  for  clarity. 


Theorem  5.1.  Let  \ir/jr*yAi-\  =  where 

Then 


=  I  •  • .  y^'“‘^  ]• 


li  =  Ui  *^x^*■-•)],•(F*)^y^'>.  di  =  1/[I.].,  (43) 

and  where 

X(0  .  [(x,(‘-‘)-d.  [xp-'J],  I,),  •  •  ,  x^‘-'>],  (44a) 

y(i)  ,  (yp-l),  y^i-t)^  .  .  ,  (y^^-»)-rf,.(y^*-»JiU..)].  (44b) 

Proof.  (43)  is  immediate  ftom  (22).  Using  (43),  we  have 

F^hdi  =  (y^"'>].d,x^'-'>.  F*u,-  =  ^x^'-'>],■yf‘■-». 

Therefore,  from  (27b) 


(45) 
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Now,  (44)  follows  from  (45).  □ 

We  can  obtain  the  triangular  factorization  of  close-to-Hankel  matrices  by  af^lying 
Theorem  5.1  repeatedly  as  described  below: 

Algorithm  5.1.  r-step  Partial  Triangular  Factorization  of  Close-to-Hankel  Matrices. 

Input:  A  generator  of  A  ■  Aq  e  R"^; 

Output:  Partiai  Triangular  Factors  L  e  R'*’®'  and  U  e  R'’’"; 
for  i  :=  1  to  r  do  begin 

Construct  a  proper  generator  of  A,_j;  /*  See  below.  *! 

Compute  I,-,  u,,  </,•  using  (43); 

Obtain  a  generator  of  A,-  by  (44); 
end 

-  [It.  I2.  •  • .  If]:  ■=  [ui,  U2,  •  • ,  w);  D  diag(di,  ^2.  •  • .  4): 

return  (L,  U.D)-, 

Example  5.1  Hankel  matrices.  One  can  use  Algorithm  5.1  to  find  the  n-step  partial  triangu- 
larization  of  the  extended  matrices  in  (25)  with  any  of  the  two  generators  in  (26);  both  genera¬ 
tors  will  need  the  same  amount  of  computation.  Triangular  factors  of  Hankel  matrices  are 
obtained  from  the  partial  triangular  factors  of  the  extended  matrices.  Also  note  that  Z*  =  Z^ . 

Example  5  J  Block-Hankel  matrices.  The  block  Hankel  matrix, 

Bx  •  • 

Bj  B2  • 

H  m .  e  B,-  g  R***.  (46) 

Bn-\  B^  •  •  B2,_2 

luu  a  law-rank  displacement  with  respect  to  the  block  sMft  displacement  operator  Z^.  How¬ 
ever,  this  displacement  operator  and  the  block  Hankel  matrix  (46)  do  not  satisfy  (24).  We  first 
add  a  block  of  null  rows  and  a  block  of  null  columns  to  (46)  to  get  the  extended  matrix 
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H  e  Notg  that  H  also  has  low  displacement  rank  with  respect  to 

{Z(b+i)6.  Z(,+i)j,1.  One  can  easily  find  a  generator  of  H  with  respect  to  these  displacement 
operators.  Now,  we  can  find  triangular  factors  of  H  from  the  nb  -step  partial  triangular  fac¬ 
tors  obtained  by  using  the  algorithm  5.1. 


Example  5  J  Hankel-Block  Matrices.  Consider  the  following  Hankel-block  matrix  B  and  its 
extended  matrix  B, 


^  ^  "  [o^  oj  "  ffi  - 

The  matrix  B  has  displacement  rank  6  with  respect  to  {Z2„+|,  Z2,+i)  We  can  obtain  triangu¬ 


lar  factors  of  B  from  the  In-step  partial  triangular  factors  obtained  by  using  the  algorithm 


Example  5.4  Inverse  of  Hankel  Matrices.  Let  H  b  be  a  Hankel  matrix.  Define  the 
matrices 

ff  j]  ,  ^  _  f  w  ol  „  „ 

M  s  j  ^  e  M  »  ^  ^  e  rC2"+»>‘(2«+i\  (47) 

where  7  is  the  reverse  identity  (ones  on  the  antidiagonal)  matrix.  The  matrix  M  is  a  Hankel- 
block  matrix  of  the  form  of  B  in  Example  53.  and  the  matrix  M  has  displacement  rank  4  with 
respect  to  [Z^+u  Zin+i).  We  use  the  Algorithm  5.1  to  get  n-step  partial  triangular  factoriza¬ 
tion  cfM. 

Lx 

SI  =  C  z?l  t/,G  01  + 

0^ 

Note  that 

H-'^UiPLi,  1/2 "/C,  LimGl, 
where  and  L\  are  upper-triangular,  because 
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H=LiDUi.  I  =  CDUi,  J=LiDG. 


Example  Inverse  of  Hankel  Matrices.  Instead  of  using  the  extended  matrix  M  in  {41), 
one  can  use  the  following  exunded  matrix. 


M 


H2  0  /ja 

0^  0  0^  e  (48) 

I7.  0  O 

where  H 2  is  as  defined  in  (25b),  and  is  the  2n  x2n  identity  matrix.  The  matrix  M  in 


(48)  has  displacement  rank  2  with  respect  to  [F,  F ).  where  F  =  Z2a  ©z5,+i .  Also  one  can 
check  that  =  XY^ ,  where 

\T 


X  = 


1 

0  • 

•  0 

0 

T 

0 

-ho  • 

0 

*0  • 

•  ^2ii-2 

1 

* 

Y  = 

1 

0  • 

0  0 

(49) 


Note  that  n-step  partial  triangular  factors  of  M  and  the  displacement  operator  F  sati^  (24). 
Therefore,  we  can  use  the  Algorithm  5.1  to  get  n  step  partial  triangular  factorization  of  hf, 
and  obtain  triangular  factors  of 


Construction  of  Proper  Generators. 

The  basic  tool  for  constructing  a  proper  (Hankel)  generator  is  the  use  of  elimination 
matrices  £,  j(ti),  defined  as  the  identity  except  for  the  element  [£, j(ti)],  j  =  t).  Notice  that 
EfJW)  is  also  different  from  the  identity,  except  that  =  -H-  Let  {X,  T)  be  a  non- 

proper  generator  of  A.  Without  loss  of  generality  we  shaU  assume  that  ^  0.  If  not,  we 
can  always  interchange  (implicitly)  columns  of  X  and  rows  of  to  obtain  such  a  generator  of 
A.  We  can  annihilate  all  elements  in  the  ith  row  of  X  except  the  dement  [Xl,,i  by  post- 
multiplying  with  the  n-1  elimination  matrices  pivoting  with  the  dement  [X],- 1, 

X  £lj(tlj)£i3(Tl3)  ’  •  £l3(Tlp).  Y  £rJ(Tl2)EfJ(Tl3)  •  •  £|~J(T1p), 
where  ti»  =  -(Xli;k/[X]i.i. 
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Again  assuming  that  ^  0,  we  can  similarly  annihilate  all  elements  in  the  ith  row  of 
Y  except  the  element  [y],^  by  post-multiidying  with  the  n-1  elimination  matrices; 

where  Yt  = -*[l^lijk/[l'1i.p-  Note  that  the  last  annihilation  £i^  does  not  destroy  the  zero  at 
This  procedure  will  require  2n  elimination  matrices,  and  therefore  20n  flops. 

If  a  matrix  A  is  symmetric,  then  the  matrix  jry4  =  XY^  is  skew  sjmunetric,  and  there¬ 
fore,  has  the  same  number  of  positive  and  negative  eigenvalues.  Hence  the  symmetric  dis¬ 
placement  has  the  form 

=XPX^,  P  = 

where  /s  is  the  S  X  S  reverse  identity  matrix  (Check  the  generators  (26a)  and  (26b)).  We  call 
the  generator  [X,XP^]  skew  symmetric.  With  a  skew  symmetric  generator  [X,XP^)  note 
that  we  only  need  to  apply 

^  £i^(il2)^u(T|3)  ’  ■  ^i.p(np), 

to  obtain  a  proper  generator.  Also,  we  only  need  to  compute  1;  in  (43)  and  X^‘^  in  (44a),  and 
Algorithm  5.1  will  give  the  Cholesky  factorization  A  =  LDL^. 


O  -fs 
h  O 


Remark  5.1.  Algorithm  5.1  for  finding  the  Cholesky  factorization  of  the  symmetric  Hankel 
matrix  in  (2)  by  using  the  generator  (26b)  (choosing  alternative  pivoting  elements)  is  identical 
to  the  Euclidean  algorithm  for  finding 


GCD(p0c),q0c)),  p(x)»Aox^~*  +  -  +/i2«-3* +^2,-2.  -7(x)*x^~*. 
We  encourage  readers  to  check  this  equivalence  by  using  the  3  x  3  Hankel  matrix. 


H  = 


5  3  2 
3  2  1 
2  1  4 


Remark  5J.  Algoridm  5.1  reduces  to  the  algorithm  of  Sugiyama  et.  al.  [24]  if  we  use  die 


generator  (49)  to  find  the  triangular  factorization  of  the  inveise  of  Hankel  matrix.  Also  the 
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O  (n  log^/i)  algorithm  of  Brent  et.  al.  [4]  (see  also  [6])  for  solving  Toeplitz  system  of  equations 
is  closely  related  to  the  divide-and-conquer  version  of  the  algorithm  by  Sugiyama  et.  al.  after 
permuting  the  rows  of  a  Toeplitz  matrix  to  make  it  Hankel.  Furthermore,  Berlekamp-Massey 
algorithm  can  be  regarded  as  the  "Levinson  version"  of  the  above  procedure,  so  that  one  can 
work  with  only  the  bottom  part  of  the  algorithm  applied  to  (49)  (see  [S]). 

6.  Fast  QR  Factorization  of  Vandermonde  Matrices. 

We  shall  show  that  the  Algorithm  S.l  can  be  used  for  the  QR  factorization  of  the  tran¬ 
spose  of  the  Vandermonde  matrix, 

=  [  c,  Afc,  •  ■  ,  ^"-‘c  ].  K=  diag(  kj,  kj.  •  • .  *,)• 

First,  notice  that  the  matrix  a  //  is  a  Hankel  matrix, 

t^Kc  •  c^A:"c 

...  e  R"’*. 

Let  us  define 

Hy  Vy 

o  ^  (50) 

. 

where  Hy  is  as  (25a)  and  v[  =  [  V,  K"c  ].  Note  that 
=  FyZ^^yeK-^ 

t  “  XPX^,  (51a) 

where 

1  0  ••  0  0  •  •  0  I''  -1 

0  •  •  0  1  0  •  •  0  -1 
^  0  —h^  •  ~^2«-a  0  0  •  •  0  ’  ^  1 

0  Ao  •  '  A,_,  *r'  *2'*  •  V*J  I* 

It  is  easy  to  check  that  the  dis|4acement  operators  (F],  Fi)  and  the  n-step  partial  triangular 

factors  of  XI y  satisfy  (24).  Therefore,  we  can  use  Algorithm  S.l  for  die  n-step  partial 
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triangularizadon  of 


Mi  = 

R^ 

0^ 

i 

[R  OQ^  ]  + 

0  0 

0  S 

Q 

Comparing  (SO)  and  (52).  it  is  easy  to  see  that 

W^=R^R,  V^^QR,  Q^Q=l. 


(52) 


Remark  6.1.  Instead  of  using  the  matrix  Mi  in  (SO),  one  may  use  the  extended  matrix 


e 


where  is  as  in  (2Sb).  and 


vlm[V.W].  WsK^V. 

For  the  matrix  M2,  one  can  check  that 

2  =  2.  F2  3  Zi,  ©AT"'. 

where 


'  1  0  • 

. 

•  0  ' 

0  -1 

X  = 

0  ho  ■ 

■  h2K-2  • 

• 

,  p» 

1  0 

7.  Concluding  Remarks. 

We  introduced  some  generalized  ix>tion  of  displacement  structure  and  developed  some  of 
their  properties.  The  displacement  structures  associated  with  Toqrlitz  and  close-to-Toeplitz 
matrices  have  been  the  most  studied  so  far.  with  some  new  results  in  [S].  In  this  chapter  we 
have  focused  on  Hankel  and  close-to-Hankel  matrices,  and  presented  a  general  algorithm  for 
triangular  factorization  of  such  matrices  and  their  inverses.  This  general  algorithm  was  also 
extended  to  obtain  the  triangular  factorizations  of  Vandermonde  and  close-to-Vandermonde 
matrices  and  their  inverses,  and  the  QR  factorizations  of  Vandermonde  matrices  and  dose-to- 
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Vandennonde  matrices.  (The  QR  factorization  of  Hankel  matrices  can  be  obtained  via  the  QR 
factorization  algorithm  for  Toeptitz  matrices  [S]).  Relationships  with  all  eailier  algorithms  for 
these  problems  have  also  been  noted.  We  remaik  that  Algorithm  5.1  can  be  easily  imple¬ 
mented  as  a  divide-and-conquer  fashion.  The  approach  taken  in  [6]  can  be  used  for  this  pur¬ 
pose. 
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Chapter  5. 

Divide-and-Conquer  Solutions  of  Least-Squares  Problems  for 
Matrices  with  Displacement  Structure 

Abstract 

A  divide-and-conquer  implementation  of  a  generalized  Schur  algorithm  enables  us  to 
obtain  (exact  and)  least-squares  solutions  of  various  block-Toeplitz  or  ToepUtz-block  systems 
of  equations  with  <9(a^nlog^/i)  operations,  where  the  displacement  rank  a  is  a  small  constant 
(typically  between  2  to  4  for  scalar  near-Toeplitz  matrices)  independent  of  the  size  of  matrices. 


1.  Introduction. 


In  recent  years,  there  has  been  considerable  research  on  fast  algorithms  for  the  solution  of 
linear  systems  of  equations  with  Toeplitz  matrices.  The  Levinson  and  Schur  algorithms  allow 
solutions  with  O(n^)  floating  point  operations  (flops)  for  systems  with  n  x  n  Toeplitz 
matrices. 

In  1980.  Brent  et  al  [S]  described  a  scheme  for  obtaining  a  solution  with  <9(nlog^n) 
flops.  This  was  based  on  two  ideas  -  the  use  of  the  Gohberg-Semencul  formula  [11],  [12],  [16] 
for  the  inverse  of  a  Toeplitz  matrix,  and  the  use  of  dividc-and-conquer  (or  doubling)  techniques 
for  computing  (generators  oO  the  Gohberg-Semencul  formula. 

Let  X  and  y  denote  the  first  and  last  columns  of  r~*  e  .  Then  if  the  first  component 
of  X,  say  Xi,  is  nonzero,  Gohberg  and  Semencul  [12]  showed  that  we  could  write 

=  -!-[L(x)L^(7.y)  -  L(Z,y)L^(Z,7„x)].  x,  0,  (1) 

•*1 

where  7,  is  the  reverse-identity  matrix,  Z,  is  the  shift  matrix. 


1 

0 

ill 

/ 

1 

.  Z,  = 

1  0 

1 

1  0 

and 

L(v)  =  a  lower-triangular  Toeplitz  matrix  with  first  column  v. 

The  significance  of  (1)  in  the  present  application  is  that  the  product  of  a  vector  and  a  lower-  or 

upper-triangular  Toeplitz  matrix  is  equivaleru  to  the  convolution  of  two  vectors,  which  can  be 

done  using  O(nlogn)  flops  (see,  e.g.  [4]). 

Brent  et  al  used  a  divide-and-conquer  scheme  for  a  certain  Euclidean  algorithm  to  factor¬ 
ize  row-permuted  Toeplitz  matrices  (i.e.,  Hankel  matrices),  and  to  obtain  the  vectors  (x,  y)  of 
the  Gohberg-Semencul  formula  with  O(nlog^n)  flops  [See  [9]  for  the  connection  of  Brent  et 
al’s  approach  to  other  related  results].  Later  Bitmead  and  Anderson  [3]  and  Morf  [20]  used 
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another  approach  based  on  the  displacement-rank  properties  of  matrix  Schur  complements,  to 
obtain  similar  results;  while  this  approach  allows  for  generalization  to  non-Toeplitz  matrices, 
the  hidden  coefficient  in  their  proposed  0(nlo^n)  constructions  turned  out  to  be  extremely 
large  (see  Sexton  et  al  [24]).  Later  Musicus  [21],  de  Hoog  [10],  Ammar  and  Gragg  [2]  used  a 
more  direct  approach  based  on  a  comtxnation  of  the  Schur  and  Levinson  algorithms  to  obtain 
better  coefficients;  in  particular.  Ammar  and  Gragg  made  a  detailed  study  and  claimed  an 
operation  count  of  Sn  log^/i  hops.  With  this  count,  the  new  (called  superfast  in  [2])  method 
for  solving  (exactly  determined)  Toeplitz  systems  is  faster  than  the  one  based  on  the  Levinson 
algorithm  whenever  n  >  256.  We  should  mention  here  that  Schur-algorithm-based  methods 
are  natural  in  the  context  of  transmission-line  and  layered-earth  models,  so  it  is  not  a  surprise 
that  similar  techniques  were  also  conceived  in  those  fields  -  see  Qioate  [7],  McGary  [19]  and 
Bruckstein  and  Kailath  [6].  A  good  source  for  background  on  the  Levinson  and  Schur  algo¬ 
rithms,  transmission  line  models,  displacement  representations  as  mentioned  and  used  in  the 
present  chapter  may  be  [13]. 

The  method  we  have  taken  in  this  chapter  is  in  the  spirit  of  the  generalized  Schur  algo¬ 
rithm  [8].  Our  algorithm  can  be  applied  to  non-Toeplitz  matrices,  and  does  not  have  the  draw¬ 
back  of  the  large  coefficient  in  the  methods  of  Bitmead  and  Anderson  [3]  or  Morf  [20].  Furth¬ 
ermore.  we  can  readily  handle  matrices  such  as  (T^T)”’  and  (r^r)“*r^,  where  T  may  be  a 
near-Toeplitz  matrix  or  a  rectangular  block-Toeplitz  matrix,  or  a  Toeplitz-block  matrix;  in  par¬ 
ticular,  therefore,  we  can  also  obtain  the  least-squares  solutions  of  over-determined  Toeplitz 
and  near-Toeplitz  systems  with  (7(nIog^n)  flops.  Our  algorithm  is  closely  related  to  the  algo¬ 
rithm  of  Musicus  [21].  However,  our  presentation  is  conceptually  much  simpler  (especially  for 
the  non-Toeplitz  cases  treated  in  [21])  than  previous  approaches;  in  particular,  we  do  not  use 
the  relationship  between  the  Schur  algorithm  and  Levinson  algorithms  needed  in  [2],  [10],  [21]. 


An  outline  of  our  approach  is  the  following.  For  a  matrix  E, 


^1.1 

^2.1  ^2^ 


E 1 1.  nonsingular. 


the  Schur  complement  of  £,  j  in  £  is 

^  ^  ^  1.1  ~  ^  2.1^  r.i^  u- 
Notice  that  matrices  such  as 

Si  =  7~*.  S3«(r^rr*r^  (2) 

can  be  identified  as  the  Schur  complements  of  the  following  extended  matrices. 

T  l]  T^T  I  jTj  jT 

-I  0  '  -I  O  '  -I  o  ■ 

Now  the  matrices  £  in  (3)  have  the  following  (generalized)  displacement  representation,  for 

suitably  chosen  matrices  [F^ ,  F* }, 


E  =  l£(x..F>')^r^(yi.F*), 

i-t 

where  ATCx,-,  F'^)  and  £(yi,  F*)  are  lower  triangular  matrices  whose  j  columns  are  (F^ 
and  (F*)^“'V,  ,  respectively.  The  smallest  possible  number  a  is  called  the  displacement  rank 
of  £  with  respect  to  [F^ ,  F*).  For  an  example,  let  T  be  an  m  x  n  scalar  Toeplitz  matrix, 
with  m  2  n.  Then  the  matrix  £2  has  displacement  rank  4  with  respect  to  {F,  F),  where 
\z„  O] 


Q  2  .  and  has  a  displacement  representation  [14], 


2  4  /,  O 

E2=j:K(ji,F)K^iXi,F)-l,K(yi,F)K^(,Xi,F).  y.  ■  ^  ,  x,.  (4a) 

i-l  i-3  L  "  . 

If  we  define  xl  ■  (w/^,  v,^],  note  that  the  matrix  £(x,  ,  F)  in  (4a)  has  the  form 


Liyfi)  O 

where  £(w/)  and  £(Vj)  are  lower  triangular Toq>litz  matrices  with  first  columns  w,-  and  v,-. 

Given  a  displacement  reptesottation  of  £,  we  use  a  certain  generalized  Schur  algoridm 
(see  Sec  2)  to  successively  compute  displacement  representations  of  the  Schur  complements  of 
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all  the  leading  principal  submatrices  in  E.  For  the  above  example,  n  steps  of  the  generalized 
Schur  algorithm  will  yield 

^  f)  -  £a:(u..  f)a:^(u..  f), 

^  )  J  ,=1  1-3 

where  the  top  n  elements  of  u,-  are  zero.  Therefore,  if  we  denote  the  bottom  n  elements  of  u.- 

as  U2J .  we  can  have  the  displacement  representation 

(T^T)-^  =  ZL(U2.,)L^(U2^)-  iL(U2..)L^(U2^). 

«»1  i“3 

Now.  the  generalized  Schur  algorithm,  which  is  a  two-term  polynomial  recursion,  can  be 
implemented  in  a  divide-and-conquer  fashion  with  0(o?f(n)lo^)  fltqjs,  where  f(n)  denotes 
the  number  of  operations  for  the  multiplication  of  two  polyrnmials.  Therefore,  if  the  multipli- 
cadon  of  two  polynomials  is  done  again  by  divide-and-conquer,  i.e..  by  using  fast  convolution 
algorithms,  then  the  overall  computation  requires  0(a’nlog^«)  flops.  We  remark  that  the  fac¬ 
tor  can  be  reduced  to  a  if  several  convolutions  can  be  performed  in  parallel.  Once  we  have 
a  displacement  representation  of  the  desired  Schur  complement  5,  the  matrix-vector  multiplica¬ 
tion,  5b.  can  be  done  with  O(ouilogn)  flops  using  fast  convolutions.  As  an  example,  we  can 
obtain  the  least  squares  solution  for  the  Toeplitz  system, 

rx  =  b.  r  e  m^n 

as  follows: 

(i)  Form  T^b  using  2  fast  convolutions. 

(ii)  Obtain  a  displacement  representation  of  using  the  divide-and-conquer 

version  of  the  generalized  Schur  algorithm, 

(iii)  Form  (X^Ty^(X^h)  using  8  fast  convolutions. 

If  we  had  obtained  the  displacement  representation  of  directly  (using  £3),  then  stq) 


(i)  above  would  not  be  needed. 
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2.  Generalized  Schur  Algorithm. 

After  a  brief  review  of  basic  concepts  and  definitions,  we  shall  describe  the  generalized 
Schur  algorithm  of  references  [8]  and  [14],  but  in  a  polynomial  form  important  for  the  divide- 
and-conquer  implementations.  We  shall  need  to  recall  some  definitions  and  basic  properties. 

Generators  of  Matrices. 

Let  and  f  *  be  nilpotent  mturices.  The  matrix 

is  called  the  displacement  of  A  with  respect  to  the  displacement  operators  [F^ ,  F*' }.  Define 
the  (F^ ,  f^l-displacement  rank  of  A  as  rank[V^^/j5.*y4  ].  Any  matrix  pair  [X,Y]  such  that 

=  XY^,  .Y  3 [ xi, X2. • . . *o ].  y  yj. yz.  •  • . yo ]  (5) 

is  called  a  (vector  form)  generator  of  A  with  respect  to  (F^,  F* ).  The  generator  will  be  said 
to  have  length  a  If  the  length  a  is  equal  to  the  disf^acement  rank  of  A ,  we  say  that  the  gen¬ 
erator  is  minimal.  A  generator  such  as  K  -  YZ,  where  £  is  a  diagonal  matrix  with  1  or  -1 
along  the  diagonal,  is  called  a  symmetric  generator. 

The  following  Lemma  [14],  [IS]  establishes  the  connection  between  generators  and  dis¬ 
placement  representations. 

Lemma.  Let  F  be  an  m  x  n  matrix.  If  F^  and  F^  are  nilpotent.  then  the  equation 

=  *>4s  the  unique  solution  E  where 

1  1 

Y(x.,  F/)*[x.,  Ffxi.  ■  • ,  F/<"-‘>x,]  and  Y(y.-.  F^My..  F^y.,  ■  • ,  F^<-'>y.-]. 

Choice  of  Displacement  Operators. 

The  generalized  Schur  algorithm  operates  with  generators,  and  needs  0{tsmn)  flops  for 
sequential  implementation  and  Oit^nXof^n)  for  divide-and-conquer  implemoitation.  Therefore, 
for  a  given  matrix  A ,  we  should  try  to  choose  the  disi^acement  operators  that  give  the  smallest 
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OL  If  the  matrix  A  is  an  n  x  n  Toeplitz  matrix,  the  appropriate  displacement  operator  F  is  Z,. 
an  n  X  n  shift  matrix.  If  A  has  some  near-Toeplitz  stnicture,  then  F  would  have  fornis  such 
as 


F  —  Z,  0Z„,,  F  —  ©Z„  ,  F  —  Z,®, 


,  and  0  denotes  the  concatenated 

i>i 


direct  sum. 


where  0  denotes  the  direct  sum, 


Z,  O 
O  Z„ 


Example  1.  Let  T  =  (ti.j)  be  an  m  x  n  pie-  and  post-windowed  scalar  Toeplitz  matrix,  i.e., 

r,j  =  0  if  j  >  i  or  i  >  m  -  n  +  j  with  m  >  n.  Then  it  is  easy  to  check  that  the  matrix 

C  ~(ci.j)sT^T  is  also  a  (unwindowed)  Toeplitz  matrix,  and  with  respect  to 
{Z„  ©Z, ,  Z,  ©Z„, ) ,  £3  in  (3)  has  a  generator  {X,  T )  of  length  2,  where 

*1  =  (co.  Cl,  • ,  c,,  -1,  0,  •  • ,  Oflco^, 

*2  =  [0.  Ct.  •  •  .  c,,  -1,  0,  •  • ,  OJ^/cJ'^, 

yi  =  [cq.  c,,  •  • ,  c,,  to,  fi,  •  •  0,  •  •  0  f /co*'^, 

y2  =  40,  Cl,  •  • ,  c,,  to,  r,,  •  •  r„_,,  0,  •  •  0  □ 


Example  2.  If  T  is  a  Toeplitz-block  matrix,  i.e.. 


T  * 


l.l 

T'u 

■ 

'2.1 

T22 

e  R"**”,  Tjj  =  scalar  m,  x  nj  Toeplitz  matrix. 


L^w.i  ■  Zif/n  J 

then  for  the  matrices  E  in  (3),  we  choose  [8],  [14]  the  following  displacement  operators 


(6) 


M  N 


Ev 

Ff  =  [©Z^leFj. 

F*  =  [©z,^]©£i,  /n=n, 

•■I 

(7a) 

Ei. 

N 

pf  *  [©Z<.,]©£i. 

I«1 

F*  =  [©Z^]©Fi,  m*n. 

(7b) 

£3: 

pf  =  [(©z^]©f  1, 

/■I 

H  M 

F*  =  I©Z^]©[©2;,,]. 

|«1  ^  tm\  ^ 

(7c) 

N 

where  Fi  can  be  either  Z,  or  However,  for  the  divide-and-conquer  implementation,  we 
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N 

prefer  to  choose  ®Z„/,  see  the  Remark  in  Sec  4. 

i=l  * 

Example  3.  On  the  other  hand,  if  the  matrix  7  in  (3)  is  a  block-Toeplitz  matrix  with  3  x  p 
blocks, 

Bq 

Bj  Bq  ■  .« 

r=  .  ...  e  R"’".  m*Afp,  n«iVp.  (8) 

,  Bm-\  Bm-2  •  fi-W+*#  . 

then  for  the  extended  matrices  E ,  we  should  choose  [8]  the  displacement  operators 

F/=2/©Z/,  F‘>=Z,»®ZI  (9) 

where,  for  £]  we  assumed  that  7  is  a  square  n  x  n  matrix. 

Generators  of  the  above  and  other  extended  Uock-Toepliiz  or  Toeplitz-bkx:k  matrices  can 
be  found  in  [8]  and  [14]. 


Polynomial  Form  of  Generators. 

In  general,  the  displacement  operators  F^  and  F*  for  both  extended  block-Toeplitz 
matrices  and  extended  Toeplitz-block  matrices  have  the  form. 


N  ti 

F  =  ©Zj^,  nm^ni. 


We  shall  say  that  the  di^acemem  operator  F  in  (10)  has  N  sections.  One  of  the  key  opera¬ 
tions  in  generalized  Schur  algorithms  is  matrix-vector  multiplication,  Fv,  i.e.  a  sectioned  shift 
operation.  With  the  polynomial  lepresmitation  of  vectors,  the  shift  operation  has  a  nice  alge¬ 
braic  expression.  For  a  given  vector  v,  let  v(z)  denote  the  polynomial  whose  coefficient  for 
the  term  z'  is  the  t-i-lst  component  of  the  vector.  i.e.. 


V  =  [Vo,  V,.  V2,  •  •  ,  V*_lf  V(Z)  »  Vo  +  VjZ  +  vit^  +  •  •  v,_,z""^  (1 1) 

Then, 


Z.V  ■  v' »  (0,  Vo,  vj,  •  • ,  v,_2]^  -»  v(z)z  mod  z" . 
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In  general,  for  the  matrix  whose  displacement  operator  is  the  F  in  (10).  let  us  define 
integers  (8,  )  by 

i 

5,  =  8i  <  82  <  •  •  <  8v. 

Let  v(z)  and  6(z)  be  polynomials  of  degree  less  than  or  equal  to  n-1,  and  define  the  degree  at 
most  (rt,-l)  polynomial,  Vj(z),  by 

v(z)  =  v,(z)  +  2^2(2)  +  z**V3(2)  +  •  •  +  (12a) 

Given  two  polynomials  v(z)  and  0(z).  and  the  displacement  operator  F  in  (10),  the  (polyno¬ 
mial  form)  displacement  operator  0/r  is  defined  by  the  following  operation, 

v(z)0f  0(z)  *  r(z)  s  r,(z)  +  2^2(2)  +  z^jiz)  +  •  •  +  z^^'V^^fz),  (12b) 

where 

/•,(z)  *  V, (2)6(2^  mod  z"*,  (12c) 

i.e.,  /■,  (z)  is  the  polynomial  v,  (z)0(2^  after  chopping  off  the  hi^r  degree  tenns,  so  that  r,  (z) 

has  the  degree  at  most  (n,-  -  1). 

Let 

X  =  [x,,  X2,  •  • ,  xj,  y  =  [y,,  Jz,  •  • ,  yj 
be  a  generator  of  a  matrix  A  with  respect  to  certain  (F^,  F^ },  and  let 

x,-4x,(z),  yi->y,(w). 

Then  we  call  the  pair  of  polynomial  vectors,  (X(z),  ^(h')),  where 

^(2)  ■  I  Jti(2).  xjKz),  •  • ,  Xa(z)  1,  y(w)  ■  [  yi(w),  y2(>v),  •  • ,  ya(w)  ], 
a  (polynomial  form)  generator  of  A .  with  respect  to  (polynomial  from)  displacement  operator 

{©f/.  ©f*)- 


Example  I  (Continued).  The  matrix  £3  in  (3)  has  a  generator  (X(z).  y(w))  with  respea  to 
{ ©f/.  ©f * ).  where  F^  ■  Z,  ®Z, ,  F*  »  Z,  ,  and 
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Xiiz)  =  [Co  +  C,z  +  •  •  +  C,z"  - 2"'^‘]Co‘'^, 

X2iz)  =  [CiZ  +  C2^  +  •  •  +  C.z"  -  z"'^*]Co''^, 

yiCw)  =  [Co  +  C,w  +  •  •  +  c,w"  +  foH'"*'  +  +  •  •  + 

y2(>‘')  =  -[CjW  +  •  •  +  C^w"  +  fow""^*  +  +  •  •  + 

Also  notice  that 

Xliz)®p/2  =  [CoZ  +  c,2^  +  •  •  +  c,_ir"  -  z"'^^]Co‘'^. 

yi(w)®ptw  = 

=  [cow  +  ciw^  +  •  •  +  Ca.jw"  +  tow''*^  +  +  •  •  +  □ 

Next  we  note  that  for  given  vectois  a  and  b  such  that  a^b  ^  0.  we  can  always  find  [8] 
matrices  6  and  4*  such  that 

a^e  =  [a,'.0.0.  •  •  ,01.  b^'P  =  [b,'.0.0.  •  •  .0).  6-4'^  =/.  (13) 

and  therefore,  a^b  =  a  I'bi'.  We  define  polynomial  matrices  6(z)  and  4'(w)  by 


^2 

w 

1 

1 

0(2)  *0 

\ 

,  4'(w)  ■  4* 

\ 

1 

1 

We  remark  also  that  if  a  =  b,  then  '¥(w)  s  &(w),  and  if  b  =  £a,  where  £  «  /p  0-/,.  then 
4'(w)  =  Sfw)!,  so  that  we  only  need  to  find,  and  post-multiply  by,  6(z). 


Generalized  Schur  Algorithm 

Let  a  matrix  E  have  a  generator  (XqCz),  Y^lw)]  with  respect  to  (®|./,  0jc-»).  and 
define  Eij  by 


^1,1 

£»  P  €  R"”". 

Zl2,|  Cy, 

where  £|  t  is  a  k  xk  strongly  nonsingular  matrix,  i.e.,  the  one  with  all  nonsingular  leading 


submatrices.  The  it -step  generalized  Schur  algorithm  [8],  [14]  presented  below  in  polynomial 
form  gives  a  generator  of  the  matrix. 
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with  respect  to  {0^/.  ®f^],  or  equivalently,  a  generator  of  5  with  respect  to  {0/'/,  0/r»). 
where  and  F*  denote  the  trailing  square  sulxnatiices  of  size  (#n  -  k)  and  (n  -  k)  of 
and  F**,  respectively. 

Algorithm  (l:-step  Generalized  Schur  Algorithm) 

Input:  Generator  of  E,  [X(^z).  Kofw));  displacement  operator  {0^^/.  0/p*}'. 

Number  of  steps  k. 

Output:  Generator  of  S  {X*(z).  l*(w)} 

Procedure  GenetalizedSchur 
begin 

for  {  :s  0  to  ik  -  1  do  begin 
a^  :=  [2-‘X,(z)],^; 
b^  :=  [z-^Yi(t)],^ 

Find  6,-(z)  and  'P,(w)  to  tiansfoim  and  b^  such  as  (13); 

X.>,(2)  =  X,(r)0f^e,{r);  Y^fyu)  =  F.(w)0^^'Pi(w) 
end 

return  {X*(2),  r*(»v)) 
end 

Remark.  The  polynomial  vectors.  X,(2)  and  /.(tv),  have  degrees  m-1  and  «-l  reqjectively, 
for  all  /.  Each  step  eliminaies  the  ntm-zeio  lowest  degree  tenn,  and  therefore  the  terms  of 
Xiiz)  and  y.-Cw)  whose  degrees  are  less  than  z*  and  >v‘  are  zeros. 

By  api^ying  the  generalized  Sdnir  algorithm,  one  can  obtain  generators,  or  equivalently 
displacement  representations,  for  various  interesting  Schur  comfriements. 
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3.  Divide-and-Conquer  Implementation. 

The  (sequential)  k-step  generalized  Schur  algorithm  in  Sec  2  can  also  be  implemented 
efficiently  using  divide-and-conquer  approach.  We  shall  only  explain  how  to  find 
essentially  the  same  argument  applies  for  I*(w). 

Let  us  define  Sp:q(.z)  and  Xp.^(z)  by 

Sp,^(z)  a  ep(z)0p+i(z)  •  •  e,(2). 

Xp._^(z)^Xo.q(z)®p,eoq,-iiz).  Xo._^(z)mX^z)  mod 

where  0  5  p  ^  q.  The  polynomial  matrix  0pjf(z)  has  a  degree  q-p+l-  The  polynomial  vec¬ 
tor  Xp.^iz)  has  degree  q,  and  is  obtained  by  dropping  from  Xp(z)  all  terais  of  degree  higher 
than  2*' .  Also  note  the  useful  properties, 

(J:  (z  )  ®  p  0i(z  )]  ®/r  02(2  )  =  x(z  )  (0,(Z  )02(z  )], 

[x,(z)  +  X2(z)l®f  0(z)  =  [JCi(z)©f  e(z)]  + 

These  properties  and  the  faa  that  Qp-qiz)  is  completely  determined  by  Xp._qiz)  allow  a  divide- 
and-conquer  implementation  of  tt^  generalized  Schur  algorithm. 

Given  Xp._q(,z),  we  can  comixite  9p.^(z)  as  follows.  If  p  =  p,  then  we  arc  successful, 
and  compute  0p,p{z)  -  0p{z).  Otherwise,  we  choose  an  "appropriate"  (see  Sec  4)  division 
point  r  such  that  p  <  r  <  q,  and  try  to  solve  the  smaller  sub-problem  of  finding  Spa-rCz). 
given  X^^_i(z).  Once  we  know  0p.r-\(z),  we  can  compute  X^.^iz)  by 

Xr.q(z)  =  Xo.^(.z)®p,0o.^.i(z)  =  [Xo:,(z)®f,0(,.^_,(z)l®^/ep.^_i(z)  (I5a) 

=  Xp.^(z)®pfep.^.i(.z).  (15b) 

Now  we  again  try  to  find  0,;,(z)  given  X,.;,(z).  After  we  obtain  0r;,(z).  we  can  combine  the 

two  results,  0;,.,_i(z)  and  0,:,(z),  by  multiplication. 

e;,-.,(O  =  e^,.,(z)0,.^(z).  (lb) 

Programming  details  of  the  above  recursive  generaUzed  Schur  edgoridun  are  shown  in  die 


Appendix. 
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The  previous  recursive  description  can  be  visualized  nonre  ursively  using  trees  (see  Fig  1 
and  2).  Each  node  in  the  tree  is  annotated  with  the  rules:  "find”,  "apply"  and  "combine", 

fp,p  ;  Find  %:p{z), 

Op:<,  '■  X^u,(2)  :=Xp._^(2)®F%.r-l(.z)> 

We  traverse  the  tree  in  post-order  (i.e.,  follow  the  order  labeled  on  each  node  of  the  tree),  and 
evaluate  the  rules. 

Now,  we  shall  consider  two  examples  in  detail. 


Example  4.  Pseudo-Inverse  ofpre  and  post  windowed  Toeplitz  Matrices. 
Consider  the  matrix  £3  in  Example  1,  where 


16 

8 

4 

1 

3 

2 

1 

1 

-1 

0 

0 

0 

8 

16 

8 

4 

0 

3 

2 

1 

1 

-1 

0 

0 

4 

8 

16 

8 

.  r^  = 

0 

0 

3 

2 

1 

1 

-1 

0 

1 

4 

8 

16 

0 

0 

0 

3 

2 

1 

1 

-1 

It  is  desired  to  find  a  displacement  representation  of  (X^T)~^T^.  This  can  be  done  by  the  4- 
step  recursive  generalized  Schur  algorithm.  The  input  to  the  algorithm  is  a  generator 
{Xo(z),  ro(w)}  of 


j'T  •j*  •pT 

-/  o 


with  respect  to  ®f*).  where  £‘^=Z,©Z„,  F’’ The  ouqnit, 

[X^{z),  Y^(}v)]  is  a  generator  of  ,  with  respect  to  {®z;_,  ®z;,l-  The  computational 

sequence  is  illustrated  in  Fig  1,  where  it  is  assumed  that  the  division  points  were  chosen  suc¬ 
cessively  by  2,  1  and  3. 


1  0 

z 

(1).  fwy  eoK)(z)  = 

0  -1 

1 

because  =  (4, 0] 


(2).  ao;i:  X i.i(z )  =  Xo.i(z  )®^/0(wj(z )  =  [4  +  2z ,  2z ] ® jf/0tM)(* )  =  (4z ,  -2z  ] 
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1  +1/2 

z 

-1/2  -1 

1 

(3).  /,,,;  e,.,(z)  =  ^- 


(4).  Co;i:  0O:l(z)  -  0O;o(z)0|.,(z)  = — 


-z/2 


V3  1  J 

(5).  ag-y.  ^2:3(2)  =  ^O:3(z)0f/0O;t(z)  =  +  3z^/2, -z^/4] 


(6)-  /2:2-  02:2(2)  = 


1  0 

z 

0  -1 

1 

because  X2a(z)  =  -^  [32^  0] 
v3 


(7).  a2y.  XjjCz)  =  A'2;3(z)®;./02:2(z)  =  :^•[3z^  z^M] 


(8)-  / 3:3-  03:3(2  )  = 


12 


1  1/12 


>ri43 

(9).  C2;3:  02:3(2)  =  02:2(2)03:3(2)  = 


-'J 

[ 

12 

Z/I2I 

z 

VT43 

[  1/12  1  J 

1 

(10).  C0.3:  00:3(2)  =  00:1(2)02:3(2)  = 


24 


V3VI^ 


z‘‘-z2/24  2^/12-2/12 

-z^/12+z/12  -z^/24+1 


(11).  agy.  X4,j(z)  =  [4+22+zhz^/4-z*/4.  2z+z2+z^/4-z^/4]0^/eo.3(z) 

=  [(4+2z+22+z3/4,  2z+z2+z3/4)  -  z\l/4,  l/4)]0^/eoj(z) 

=  -z^[(l/4.  1/4)©o.3(z)  mod  z*] 

=  -:^=^=[2/12-z2/24-z^/2.  1-2/2-z2/24+z3/12] 

Because  T^T  is  symmetric,  4'o:3(w)  =  0o:3(H')r,  where  Z  =  1©-1,  and  therefore. 

I"4:i3(»*')  =  [(4+2z+24z’/4>t-z^3/4+z/2+z^/4-z^/4), 

(2z  +z  ^+z  ^/4>+z  “(3/4+2  /2+z  ^/4+z  ^/4-z  “/4)1 0^*  0o.3(w  )Z 
2^6 

=  :^=^=[l/4z+z^/24-3z^/2+49z“/24+l  lz®/8+13z®/24+3z^/2. 

-3-z  /2+z^/8-2z^/3+l  lz“/8-13/24r5-z‘‘/8-z‘^/12]. 


Therefore, 


(jTjyXjT  ^  {yx)  +  £.(x2)i'^(y2)l.  Y= 

where  L(x,)  and  Z(y,)  are  the  lower  triangular  Toeplitz  matrices  whose  first  columns  are  x,- 
and  y, ,  respectively,  and 

*1  =  [0.  -1/12.  1/24,  l/2f . 
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X2  =  [-1,  1/2.  1/24,  -1/12]^, 

y,  =  [0,  1/4,  1/24,  -3/2,  49/24,  11/8,  13/24,  3/2]^. 

y2  =  [-3.  -1/2,  1/8.  -2/3,  1 1/8.  -13/24,  -1/8,  l/12f .  □ 


Remark  I.  For  a  symmetric  generator  of  length  2  with  P  =  1,  the  2x2  polynomial  matrix 
9(z)  in  (14)  can  have  the  forni  (hypertx)lic  reflection) 


Let 


e,(^)  = 


chjZ 

-shiZ 


shj 

-chi 


ch^  -  sh^  =  1. 


0p:„(2)  =  ep(z)0p^,(z)  -  •  •  0,(Z)H 


0l.l(z)  01.2(z) 
02.l(z)  02.2(z) 


Then,  by  induction,  one  can  easily  prove  that 


z^-P^'0,.j(z-')  =  (-l)’-'’^‘02^(z),  z’-^’^'0,.2{2-')  =  (-l)’~^^'02,i(z). 
Therefore,  we  need  to  compute  and  store  only  two  entries  of  0p.,(z). 


Remark  2.  For  an  unwindowed  scalar  Toeplitz  matrix,  the  matrix  £2  (3)  displacement 

rank  4,  whereas  the  matrix  £3  has  displacement  rank  5.  Therefore,  when  we  solve  Toeplitz 
least  squares  problems,  it  is  more  efficient  to  find  a  displacement  representation  of  {T^Ty^ 
rather  than  of  {T^Ty^T^ .  With  the  notation  in  (4),  the  matrix  £2  for  an  unwindowed  scalar 
Toeplitz  matrix  T  =  (r,_y)  e  R""*  (m  >  n)  has  a  generator  [14], 

W,  =  r^ti/llt,ll,  W2  =  t2.  W3  =  Z,Z,^Vi.  W4  =  Z,1, 

ft  *  f^O*  fl»  ’  '  •  /m-|]  »  ^2  “  f-I*  '  ’  >  fl-dJ  '  ^  ®  f/m-l*  ‘  ‘  »  • 

V,  =  V3  =  e,/llt,ll,  V2  =  V4  =  0. 

where  1 1- 1 1  denotes  the  Euclidean  norm,  and  e|  is  the  vector  with  1  in  the  flist  position,  and 
zeros  elsewhere. 


Example  5.  Displacement  Representation  for  the  inverse  of  a  Sylvester  Matrix. 


Let  T  denote  the  following  Sylvester  matrix. 
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T  s 


2  0  0  1  0 
1  2  0  2  1 
3  12  12 
0  3  111 
0  0  3  0  1 


(17) 


and  suppose  that  it  is  desired  to  obtain  a  displacement  representation  of  Then  the 


appropriate  extended  matrix  is 


T  / 

-/  O  ’ 


(18) 


and  it  is  easy  to  see  that  the  following  {Xo(z),  To('*'))  is  a  generc^or  of  with  respect  to 


{0f/.  0/r*}.  where  F-''  F'>  -Zi®Z2®Zi, 


■^o(z  )  -  [j:  i(2  ).  X2(z  ),  X 3(z  )],  To(w  )  s  [y  ,(w  ),  y  jfw  ),  y  3(h'  )] 

Xi(z)  =  2  +  z  +  3z^  -  z^  X2(z)  =  1  +  2z  +  z^  +  z^  -  z®,  X3(z)  =  1,  (19a) 

)’i(w)=l,  y2(w)  =  w^  y3(w)  =  w*  (19b) 

Now  the  5-step  recursive  generalized  Schur  algorithm  gives  a  desired  generator  of  T"*,  with 

respea  to  {Z5,  Zj),  and  a  possible  computational  sequence  is  shown  in  Fig  2,  where  the  divi¬ 
sion  points  are  chosen  successively  as  2,  1,3  and  4. 


’  z  -1/2  -Ml' 

w  0  0 

(!)•  f  ®0K)(z)  = 

0  1  0 
.0  0  1 

.  ‘J'0:0(w)  = 

wll  1  0 
wtl  0  1. 

(2).  Xi:i(z)  =  (2z.3z/2,-z/21.  r,;,(w)  =  [w,  0.  0] 


z  -3/4  -1/4' 

w  0  0 

(3).  /i:i:  eH(z)  = 

0  1  0 
.0  0  1 

.  '*'t:t(>*')  = 

3w/4  1  0 
-w/4  0  1 . 

z^  -3Z/4-1/2  z/4-1/2' 

0  0 

(4)*  ®0:l(2)  = 

0  1  0 
.  0  0  1 

.  '*'o:l(>*’)  = 

w^/2+3w/4  1  0 
w^/2-w/4  0  1 

(5).  00:4=  ^2:4(z)  =  [2z^+z’+3z\-5z^/4-5zH-5z^/4+3z’/4] 

yiA(y^)  =  l'O:4(»«')0f»*Fo:l(w) 

=  [(1,  0,  0)'Fo.j(w)  mod  w’]  +  w,  0)'Fo.i(w)  mod  w*] 
=  (w*+3w^/4,  w’,  0] 


'z  5/8  5/8' 

w  0  0 

(6)-  /2:2’  ®2a(*)“ 

0  1  0 

^2a(w) » 

-5w/8  1  0 

.0  0  1  . 

.-5w/8  0  1. 
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(7).  a2:4--  X3.4(2)  =  12z4z'‘.  -Sz^/S+lSz^/g,  llz^S+lSz^/g], 
y^Aiw)  =  J'2,4(w)0^»4'2:2(vv) 

=  l(w\  0.  0)'P2:2(m')  mod  w^]  +  w\0wl4,  1,  0)'P22(w)  mod  w^] 
=  [-5w‘’/8.  w^.  OJ 


0  1  o' 

‘-16h;/5  1  0 

(8)-  /3:3-  ®3:3(2)~ 

2  16/5  n/5 

.  'J'3:3(>*')  = 

w  0  0 

0  0  1  . 

.-11h>/5  0  1 

(9).  a3,.4:  XU2)  =  hSz*/S,7z\6z%  y4:4(w)  = 

[w^  -5w‘‘/8,  0] 

[  z/(2V2)  28/(5V2)  6/5 

w/(2V2) 

5/(1 6V2)  0 

3 

II 

-5z/(16V2) 

1/(2V2)  -3/4 

‘*'4:4(W)  = 

-28w/(5>/2) 

1/(2V2)  0 

0 

0  1  . 

-I2V2W/5 

0  1 

After  evaluating,  C3.4,  C2;4  add  <^o;4*  we  obtain  0o;4(z)  and  'I'o:4(w),  and  Anally 


(14).  do:9‘  >^0:9(2)  =  (■'1(2 )•  ^2(2)- JC3(2)l©f/Qo:4(z) 

=  z’[(-l.  -Z^  O)®^/0O;4(Z)] 

=  Z^[(-l. -Z^  O)0O:4(?)  mod  2^]  =  z*[u,(z),  «2(2).  “3(z)]. 

where 

«,(z)  =  -z/(2V2>-z2/(2>^)+2^/V2  +  z^/>/2 

«2(2)  =  4/(5V2)  +  4z/'l2  +  1622/(5^2)  -  28z’/(5V2)  -  2Sz*/(5^) 

U3(z)  =  2/5  +  z/5  +  2z^/5  +  z^/5  -  6z*/5. 

>"o:9(w)  =  l>l(w).  y2(w).  y3(w)]®;r»^0:4(w) 

=  >v^[(0,  0,  l)0f*'Fo:4(w)]  =  w’IviCh'),  V2(>v),  VsCw)], 

where 

v,(w)  =  -12'^w/5  +  12w2/(5V2)  +  12w^/(5V2)  -  I2w^/(5V2), 

V2(w)  =  -w/VI  +  w2/(2V2)  +  h'^/(2V2)  -  w*/(2'l2). 

V3(W)  =  1. 

Therefore, 

r-‘  =  Z.(Ui)Z.^(V,)  +  Z,(U2)L^(V2)  +  £.(U3)I.^(V3). 

where  u,-  and  v,  are  the  vectors  whose  yth  component  is  the  coefficient  of  and  >v^~'  of 


□ 


II,  (z)  and  v,(w),  respectively. 
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Remark  1.  If  we  had  chosen  the  displacement  operator  F^  =  Z^^^Z^QZ^ 

for  the  matrix  T  in  (17)  we  would  have  the  same  generator  (19)  for  £i,  but  the  obtained  gen¬ 
erator  of  7^*  would  be  the  one  with  respect  to  [Z^QZ^,  Zj)  rather  than  with  respect  to 
{Z5,  Z5}.  The  displacement  ranks  of  7^'  with  respect  to  both  displacement  operators  are  2, 
but  the  above  procedure  gives  non-minimal  generators  of  length  3. 


Remark  2,  The  following  extended  matrix 


T  b 

-/  0  I  •  ^  ~  tnatrix  (20) 

also  has  a  displacement  rank  of  3.  One  could  as  well  obtain  the  solution  7^*b  directly  by 
applying  the  recursive  generalized  Schur  algorithm  to  (20);  the  last  column  of  X,  where  {X.  y) 
is  the  computed  generator  of  7“'b  with  respect  to  {Z„,  1),  can  be  shovm  to  be  the  solution 
7~’b. 


4.  Polynomial  Products  with  Fast  Convolutions. 

The  produa  of  two  polynomials  of  degree  dj  and  d2  can  be  performed  efficiently  using 
d  3  di+d2+l  point  fast  cyclic  convolution  algorithms  [4].  Among  others,  fast  Fourier 
transformations  (FFT's)  can  be  used  for  convolutions,  and  Ammar  and  Gragg  [2]  carefully 
examined  the  use  of  FFT’s  for  a  doubling  algorithm  for  square  Toeplitz  systems  of  equations. 
We  shall  oidy  consider  the  subtle  complications  that  arise  in  the  recursive  generalized  Schur 
algorithm  in  this  chapter. 

The  polynomial  matrix-matrix  product  of  (16)  needs  of  q-p  point  cyclic  convolutions. 
The  polynomial  vector-matrix  produa  of  (ISb)  has  of  scalar  polynomial  products  of  the 
form,  x(z)®^/8(z),  where  x(,z)  is  a  polynomial  with  nonzero  terms  of  1^,  ,  z*.  La 

us  assume  that 

0  <  5i  <  ■  •  <  6/  ^  p  <  5/+1  <  •  •  <  5,  S  r  <  5,+i  <  •  •  <  5|  S  ^  <  5,+i  <  •  •  <  5^. 
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Then 

x'(2)  s  x(z)0^/0(z)  (21a) 

=  [z\+i(z)  +  +  -  •  +  z®'x,+,(z)  +  •  •  +  z\+,(z)]0^/0(z)  (21b) 

=  )  +  •  •  +  (2  )]  0^/0(2  )  (22a) 

+  2®'[x,*,(z)0(zP)  mod  z"**‘]  (22b) 

+  2*'*'[x,*2(2)0(2p)  mod  2"'**]  (22c) 

+  z*'[JC<+i(z)6(2^  mod  z"**’].  (22d) 


The  terms  in  (22a)  do  not  need  to  be  computed  because  these  terms  will  be  summed  to  zeros 
after  adding  all  the  partial  sums  in  the  vector-matrix  multiplication  of  (ISb).  Recall  that  x,-  (2) 
has  degree  n,-,  and  0(z^)  has  degree  Therefore,  the  product  x,  (z)0(z^  from  (22b)  to 

(22d)  can  be  performed  by 

2rt,+l  point  cyclic  convolutions  if  degree[6(zP)]  S  dcgree[x,  (z)]  , 

/t,+p^^"'’'*’‘Vl  point  cyclic  convolutions  if  degree[0(zP)]  <  dcgree[x,  (z)) . 

Remark.  Notice  that  two  d/2  point  convolutions  take  cdlog(d/2)  flops  if  one  d  point  convolu¬ 
tion  takes  cdlogd  flops.  Therefore,  the  polynomial  product  (21)  is  more  efficient  for  the  dis¬ 
placement  operator  with  more  sections,  because  such  di^lacement  operators  break  a  long 
convolution  into  many  smaller  convolutions.  Therefore,  for  a  given  matrix  we  prefer  to  choose 
a  displacement  operator  with  as  many  sections  as  possible,  while  keeping  the  displacement 
rank  minimal.  Also  we  remark  that  the  first  and  last  terms  (22b)  and  (22d)  need  smaller  point 
convolutions. 

If  the  dimensions  of  the  matrix  are  powers  of  2.  then  we  can  always  choose  the  center 
division  point,  r  ^\{p-¥q)l%.  This  balanced  division  (or  douUing)  gives  the  least  number  of 
computations,  in  general.  For  this  case,  let  t)  ■  p-q,  and  r(ii)  denote  the  number  of  compu¬ 
tations  for  one  lecursioa  Then 
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T(n)  ^  2r  (11/2)  +  w  (11).  ly  (n)  s  o  (a^iiiogn). 
and  therefore,  one  can  show  [1]  that  the  /t-step  recursion  takes 

T(k)  ^  Oiahlo^k). 

However,  in  most  cases  the  doubling  is  not  possible,  and  for  such  circumstances,  the 
desirable  choice  of  r  is  such  that  r~p  and  q-r+l  are  highly  composite  numbers  (so  that  fast 
convolution  algorithms  can  be  appliol  efficiently),  as  well  as  r  is  close  to  (9-p)/2  (so  as  to 
achieve  balancing). 

Matrix-Vector  Products  using  Displacement  Representation. 

The  final  step  of  finding  solutions  for  linear  equations  is  the  matrix-vector  multiplication 
5  b,  given  a  displacement  representation  of  S  e  R'"’“ , 

5  =  iAr(Xi.  Ff)K'^{yi ,  F^.  (23) 

(-1 

where  the  length  a  is  a  multiple  of  the  block  size  p,  a  =  pS,  say.  and 

ht  w  M  N 

<-i  «-i  i.i  i-i 

The  expression  in  (23)  can  be  rewritten  in  the  block  displacement  form 
8 

S  =  Ff)Kl{Yi,  F\  Xi  e  T,  e  R"****.  (24) 

(«i 

where 

K^{Xi.  Ff)  =  [Xf,  F^Xi,  Ff%.  ■  •  e  R"**  (25a) 

./=■*)  =  lYi ,  F^ Yi ,  ,  •  f  ]  e  R"’*" .  (25b) 

Furthermore,  because  F^  and  have  Af  and  A/  sections,  re^ctively,  (25a)  and  (25b)  have 

the  fomts 


'  W,.zi,)  0 

'  2^,)  o' 

.  A:p(yi.F*)» 

/p(r/v,..24)  0 
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where  K^QC,  is  the  block  lower  triangular  Toeplitz  matrix  with  the  first  column  block  X. 
The  matrix  O  denotes  a  null  matrix  of  ai^ropriate  size  such  that  ATpCX,-,  pf)  and 
aremxnandnxn  matrices,  respectively. 

To  see  how  to  use  convolutions  for  the  product, 

AfpfX.,  Ff)Kl(Xi.  F*>)h, 

it  is  enough  to  consider  matrix-vector  multi(4ications  of  the  form  K^(X,  Z^b.  Note  that 
ATpCX,  Z^)b  can  be  expressed  as  sum  of  P  products  of  scalar  lower  triangular  Toeplitz  matrix 
and  vectors.  As  an  example. 


bo 

bo 

Co 

bi 

a,  c. 

bt 

Oi  Oo 

0 

Ct  Cq 

0 

02  C2  Oq  Cq 

bz 

02  Oi  Oq 

bz 

+ 

Cz  Cl  Cq 

b-i 

^3 

bz 

Oj  O2  Oq 

0. 

C3  C2  Cl  Cq 

0. 

The  multiplications  in  the  right  sides  of  (26)  can  be  done  by  fast  convolutions,  and  therefore, 
so  can  the  multiplication  5  b. 

5.  C  mcluding  Remarks. 

We  have  presented  0(a^nlog^n)  algorithms  for  the  determination  of  exaa  aivl  least 
squares  solutions  of  linear  systems  with  matrices  having  (generalized)  displacement  rank  a. 
Such  algorithms  for  exaa  solutions  have  been  studied  by  several  authors,  most  recently  by 
Ammar  and  Gragg  [2]  for  Toeplitz  systems.  They  also  made  a  very  close  study  of  the  imple¬ 
mentation  of  the  convolution  operation  in  an  attempt  to  obtain  the  smallest  coefficient;  we  have 
not  attempted  so  close  an  analysis  for  the  more  general  algorithm  in  this  dupter.  Nor  have  we 
attempted  a  numerical  error  analysis  of  the  algorithm;  nevertheless  one  mi^  hope  that  numeri¬ 
cal  refinements  devised  for  the  Schur  algorithm  (see  e.g.,  Kobiadd  and  Lancaster  [17])  may  be 
carried  over  to  the  divide-and-conquer  fnunewotk  as  well. 
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AH>ENDIX 

We  shall  summarize  the  explanation  in  Sec  3  using  a  Pascal-like  recursive  procedure. 
First,  note  thstt  the  polynomial  0p:,(z)  (and  'Pp.^(z))  has  q-p+2  terms.  The  first  column  of 
has  terms  ranging  from  degree  z  to  and  the  other  columns  have  terms  firom  1 

to  Hence,  by  shifting  the  first  column  by  one  position,  we  can  store  0p;,(z)  and  %:,(z) 
in  the  array  "Poly"  from  p  to  q  slots  inclusive; 

Poly:  array  [l..a,  l..a,  0..MAX-1]  of  record 
6:  coefficients; 
coefficients 

end; 

The  computation  of  is  sequential,  i.e.,  once  we  compute  0p.^(z),  we  do  not  need  to 

keep  0p.,_i(z).  and  therefore,  the  array  "Poly"  can  be  kept  as  a  single  global  variable. 

The  polytwmial  vector  Xp.^{z)  has  q-p-^\  terms,  and  therefore,  can  be  stored  in  an  array 
type  GENERATORS: 

type 

GENERATORS  =  array  [l..a,  0..MAX-1]  of  record 
x:  coefficient; 
y:  coefficient 

end; 

However,  Xp.^{z)  cannot  be  kept  as  a  global  variaUe,  and  local  copes  should  be  maintained 
until  we  compute 

Now  we  can  describe  the  recursive  generalized  Sdiur  algorithm  as  follows. 
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Algoriihm  (Recursive  k-step  Generalized  Schur  Algorithm). 

input:  Generator  of  £,  {;fo(z).  di^Iacement  operator  {0^/,  ©^*); 

Number  of  steps,  k. 

Output:  Generator  of  S.  {X^(z). 
procedure  RecursiveSchur 
var 

G.  LoweiG;  GENERATORS: 
begin 

Find(0.  k-l,  G); 

Apply(0.  k.  n.  G.  LoweiG); 
return  (LowetG) 
end 

The  procedure  Find(p,  q.  G)  computes  0^,(7 ).  and  given  Y^.^(w)].  and 

the  procedure  Apply(p.  r.  q.  G.  LoweiG)  returns  UweiG  =  {X,,(z).  r,,,(w))  given  G  = 

procedure  Find(p,  q:  index;  G:  GENERATORS); 
var 

r :  index; 

G,  LoweiG:  GENERATORS; 
begin 

it  p  ssq  then  begin 

Compute  0^^(i)  and  'F^y(w); 
return 


end 


r  :=  appropriate  integer  close  to  r(p+^  V^; 

Find(p.  r-l,  G); 

Apply(p,  r,  q.  G,  LowerG); 

Find(r,  q.  LowerG); 

C**  Use  fast  convolution  for  polynomial  products  *) 

:=  ep:,-l(z)e^^(z); 

end 

procedure  Apply(p.  r,  q;  index;  G:  GENERATORS;  var  LoweiG:  GENERATORS); 
begin 

(*  Use  fast  convolution  for  polynomial  products  *) 

X,,^(2)  :=Xp:,(z)®^/ep3._,(2); 

n=,(w)  := 

LoweiG  :=  ^^^(w)) 

Free  the  storage  of  {Xp.^(z),  T^;,(w)); 
return  (LoweiG); 
end 
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Chapter  6, 

Concluding  Remarks. 

After  a  summary  of  the  main  features  of  our  approach  to  fast  algorithms,  we  shall  briefly 
note  some  areas  for  further  investigation. 

1.  Summary  of  Results. 

We  introduced  two  different  displacements  of  a  matrix  A  e  R"”"  defined  as 

aA  (1) 

or 

Vv*/*  a  fU  -  (2) 

where  and  F*  are  chosen  matrices.  We  call  the  matrices  and  the  Toeplitz 

and  Hankel  displacements  of  A  with  respect  to  the  displacement  (^rotors  {F^ ,  F* ),  respec¬ 
tively.  We  say  that  the  matrix  A  is  structured  if  has  low  rank  (close-to-Toeplitz)  or 

has  low  rank  fclose-to-Hankel),  where  "low"  is  with  reference  to  the  dimensions  of 
A .  The  ranks  of  the  displacements  and  are  called  the  displacement  ranks 

of  the  matrix  A .  The  computational  complexity  (as  well  as  the  space  comidexity)  of  fast  algo¬ 
rithms  is  proportional  to  the  displacement  rank  of  the  matrix.  Therefore,  the  displacement 
operators  should  be  chosen  so  that  the  disfdacements  have  ranks  as  low  as  possible. 
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Invariance  of  Displacement  rank  under  Schur  complementations. 
A  basic  property  is  the  following.  Let  the  block  matrix 


A  B 
^  CD 

have  Hankel  (Toeplitz)  displacement  rank  a  with  respect  to  lower  triangular  displacement 
operators,  [F^ ,  F* ).  Then  the  matrix. 


O  O 
O  S 


,  S  =  D  -  CA~^B ,  the  Schur  complement  of  M  with  re^ct  to  A , 
has  the  same  Hankel  (Toeplitz)  displacement  rank  a  with  respect  to  [F^ ,  F** ).  This  result  has 
many  consequences.  For  example,  by  considering  the  matrices 


r.4  /' 

1 

1 _ 

■  7  ^1 

1 

o 

_ 1 

• 

1 

o 

_ 1 

f 

1 

o 

_ 1 

we  can  see  that  the  inverse  A~^  and  the  product  A^A  also  have  low  displacement  rank  if  A 
has  low  displacement  rank. 

The  displacement  structure  of  a  matrix  is  captured  by  its  generators  viz.,  a  matrix  pair  X , 
Y  that  satisfies  the  displacement  equations 


=Xy^.  or  =XY^,  X  e  Y  e 

The  inverses  and  Schur  complements  of  structured  matrices  can  be  obtained  by  operating  on 

their  generators.  Doing  so  requires  0{amn)  computations,  where  a  is  the  displacement  rank 
of  A ,  whereas  working  with  the  matrix  A  itself  requires  0{mn^)  or  0(m^n)  computations. 

We  described  such  algorithms  for  QR  factorization,  inversion,  regularization,  and  solution 
of  least  squares  problems.  The  key  to  obtaining  these  results  is  a  combination  of  (efficient) 
algorithms  for  successive  Schur-complementation  applied  to  certain  judiciously  chosen  "compo¬ 
site"  (block)  matrices. 


We  shall  briefly  outline  the  resulting  generalized  Schur  algorithms. 


-  138  - 


Generalized  Schur  Algorithms. 

To  efficiently  compute  the  Schur  complement  D  -  CA~^B  of  the  matrix 

[^4  B 
^  ^  CD 

we  first  need  to  obtain  a  proper  generator  for  Af ,  i.c.,  a  generator  of  the  form, 

*  0 - ol  r*  0 - o] 

X=  ,  Y  = 

*  * -  *  * 

for  the  Toeplitz  displacement  (Chapter  2),  and 


for  the  Hankel  displacement  (Chapter  4).  A  non-proper  generator  of  A  can  be  converted  to  a 
proper  one  in  several  ways.  One  is  by  applying  the  following  matrices, 

1 

C  S2 

-s,  c  '  C^  +  S,S2=l  (5) 

1 

which  can  be  used  to  null  out  different  entries  by  appropriate  choices  of  {c,  s,,  S2}-  Proper 
generators  (3)  or  (4)  can  be  obtained  by  using  the  matrix  (5),  or  its  special  cases:  Givens  rota¬ 
tions,  hyperbolic  rotations  and  elementary  matrices.  By  post-multiidying  X  and  Y  with  a 
sequence  of  appropriate  matrices  5,^,  we  can  transform  X  and  Y  to  proper  form  with  0(am) 
computations. 

The  next  step  is  to  modify  one  column  of  X  and  T.  Repeating  the  same  step  (i.e.. 


transformation  to  proper  form  followed  by  a  modifications  of  the  column)  r  times,  where  r  is 
the  size  of  the  square  block  A  in  M  produces  the  generator  of  the  Schur  complement 
D-CA-^B. 

Displacement  Representation  of  Composite  Matrices  (Giapter  3). 

The  above  algorithms  can  be  applied  to  the  block  matrices 

r  /  T^T  I  T2  jTj  jT  jTj  jT 

^  ^\_i  o  '  I  oy  tI  o  '  [  T  /  J’  [  /  o  y 

to  obtain  generalized  Gohberg-Semencul  formulas  or,  equivalently,  displacement  representa¬ 
tions,  of  the  matrices, 

T~\  rj7T*r2,  r(r^r)"‘r^, 

Fast  Matrix  Factorizations  and  Solutions  of  Linear  Equations  (Chapters  2  and  4). 

Another  use  of  the  generalized  Schur  algorithm  is  to  obtain  fast  solutions  for  various 
equations  of  structured  matrices  such  as  Toeplitz  and  close-to-Toeplitz  matrices,  Hankel  and 
close-to-Hankel  matrices,  and  Vandermonde  matrices. 

Divide-and-Conquer  Implementations  (Chapter  5). 

It  turns  out  that  the  generalized  Schur  algorithm  can  be  easily  implemented  in  divide- 
and-conquer  fashiott 

2.  Some  Known  and  Unknown  Numerical  Properties  of  the  fast  Algorithms. 

In  this  section,  we  shaU  present  a  rather  casual  description  of  the  numerical  properties  of 
the  fast  algorithms  considered  in  this  thesis. 

2.1  Levinson  Algorithm. 

The  Levinson  algorithm  was  analyzed  by  Cybenko  [6].  Bimch  [4]  clarified  Cybenko’s 
work  by  introducing  the  concept  of  three  different  numerical  stabilities. 


-  140- 


Stability:  An  algorithm  for  solving  linear  equations  is  stable  for  a  class  of  matrices  M  ffor 
each  i4  e  M  and  for  each  b  the  computed  solution  it  to  Ax  =  h  satisfies  /C  i  =  b  for  some  A 
and  b,  where  A  is  close  to  A  and  b  is  close  to  b. 

Strong  stability:  i4n  algorithm  for  solving  linear  equations  is  strongly  stable  for  a  class  of 
matrices  M  if  for  each  A  e  M  and  for  each  b  the  computed  solution  *  to  Ax  =  b  satisfies 
AH  =  6,  where  A  e  M  and  A  and  6  are  close  to  A  and  b,  respectively. 

Weak  stability:  An  algorithm  for  solving  linear  equations  is  weakly  stable  for  a  class  of 
matrices  M  if  for  each  well-conditioned  >1  e  M  and  for  each  b  the  computed  solution  ft  to 
Ax  =  b  is  such  that  I Ix-itI I/I ixl I  is  small. 

According  to  the  above  definitions,  strong  stability  implies  stability  and  stability  implies  weak 
stability. 

Sometimes  it  is  hand  to  prove  that  an  algorithm  is  stable.  Bunch  pointed  out  that 
Cybenko  only  proved  that  the  Levinson  algorithm  is  weakly  stable.  To  see  this  let  us  consider 
Cybenko’s  result  and  the  classical  perturbation  theorem  (see  e.g.  [18]). 

Cybenko’s  result:  The  computed  solution  ft  to  Tx  =  b  will  always  have  a  small  residual, 
r  mTft-b  for  "well-conditionetT  symmetric  positive  Toeplitz  matrices. 

Perturbation  Theorem:  If  Ax  =  b  and  A!t  =  b.  where  A  is  nonsingular,  and  if 
WA-A  II•1U“*II  <  1,  then  A  is  nonsingular  and 

llx-»ll  ^  k(A)  [  \\A-A\\  .  llb-611  ' 

11*11  ,  lull  llbll  • 

From  Cybenko’s  result  and  the  above  theorem,  it  is  easy  to  see  that  the  Levinson  algorithm  is 
weakly  staUe,  because 
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li«-xll 

llxll 


<K(r) 


llrll 

iibir 


If  Cybenko  had  shown  that  r  =  Tk  -  b  is  small  for  all  symmetric  positive  Toeplitz  matrices 
(including  ill-conditioned  matrix)  then  he  would  have  proven  that  the  Levinson  algorithm  is 
stable.  It  is  unknown  [4]  whether  the  Levinson  algorithm  for  symmetric  positive-definite  Toe¬ 
plitz  matrices  is  stable  (in  Bunch’s  sense)  or  not 


2.2  Schur  Algorithm. 

Numerical  properties  of  the  (generalized)  Schur  algorithm  are  also  irot  fully  understood 
yet.  However,  it  would  not  be  a  wild  conjecture  that  the  Schur  algorithm  is  also  (at  least) 
weakly  stable  because  the  Schur  algorithm  and  the  Levinson  algorithm  are  closely  related  (see 
Chapter  2).  To  compare  the  Schur  algorithm  with  the  Cholesky  algorithm,  we  generated  the 
ill-conditioned  positive-definite  Toeplitz  matrix  (See  Appendix  for  how  to  generate  ill- 
conditioned  positive-definite  Toeplitz  matrices). 


1 

.99 

.999602 

.98922 

.99847 

.99 

1 

.99 

.999602 

.98922 

.999602 

.99 

1 

.99 

.999602 

.98922 

.999602 

.99 

1 

.99 

.99847 

.98922 

.999602 

.99 

1 

K(r)  =  2.9  X  lO’. 


Let  Lg  and  L,  denote  the  lower  triangular  factors  of  T  computed  with  single  precision  arith¬ 


metic  by  the  Cholesky  algorithm  and  the  Schur  algorithm,  respectively.  Our  simulation  results 


show  that 


II  r 112  =  7.5  X  KT*.  II  r 112  =  8.9  X  10-*.  (6) 

Therefore,  we  cannot  exclude  the  possilniity  that  the  Schur  algorithm  is  even  stable  in  Bunch’s 

sense  (see  [10]  also).  Moreover  certain  "thresholding  strategy"  in  which  "anal!"  reflection 

coefficients  are  set  equal  to  zero  can  substantially  improve  the  numerical  behavior  of  the  Schur 

algorithm  [3],  [13]. 
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2  J  Troubles  associated  with  large  Reflection  Coefficients. 

Cybenko  obtained  posteriori  bounds  on  the  condition  number  of  Toeplitz  matrices  in 
terms  of  the  reflection  coefficients  [6]  (see  [13]  for  a  more  recent  result).  For  close-to-Toeplitz 
matrices  large  reflection  coefflcients  indicate  a  large  condition  number  of  the  matrices. 

It  is  well-known  [2],  [17]  that  hyperbolic  rotations  with  large  reflection  coefflcients  give 
numerically  poor  results.  To  see  this  consider  a  single  2x2  hyperbolic  rotation; 

ck  -sh 

[a'.b']  =  [a.b]  «.  H=  .  (7) 

where 

SL  =  [ai,  02,  ■  ■  ,  a^f ,  b  =  [6,.  *2.  •  •  .  Ia,l  >  Ihil, 

a'  =  [a,'.  a2\  •  • .  a,'f,  b'  =  [0.  h/.  •  • . 

Let  us  slightly  perturb  the  ‘data’  a  and  b  by  Pi  and  P2>  of  Hi*  the  result¬ 

ing  relative  perturbation  in  the  ‘results’: 

[a'(l+Tli),  b'(l+ti2)]  =  [a(l+p,).  b{l+p2))  /f. 

Then  clearly 

cAp,  -sAp,' 

[ili«'^2*»']  =  [«’bj[-.Ap2  cAp2j- 

Therefore 

ll[tl,a'.  Ti2b']llf  S  [2(cA2  +  sA2)p2]^.||[a,  b]ll/r  =  2\-  •ll[a,  b]ll/r.  (9) 

where  p  =  max(lpil,  IP2I),  and  A  is  the  reflection  coefflcient,  k  =  shlch  =  bilai.  From  (9), 
one  can  see  that  a  hyperbolic  rotation  gives  an  inaccurate  result  if  the  reflection  coefficient  k  of 
H  is  close  to  ±1.  i.e..  a/  is  close  to  zero  (which  is  not  surprising  because  of  the  large 
diflerence  in  the  two  eigenvalues  of  H). 

To  see  the  numerical  difficulty  associated  with  large  reflection  coeffidents,  we  generated 
a  positive-definite  Toeplitz  matrix  with  the  following  reflection  coefficients  (see  Appendix), 
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K  =  [0.  0.99999,  -0.99999,  0.99999,  -0.99999]. 

The  resulting  matrix  has  condition  number  ic(7')  =  6.6  x  lO'^.  When  we  applied  the  Schur 

algorithm  with  double  precision  arithmetic  to  re-compute  the  reflection  coefflcients,  we 

obtained 

K  =  [0,  0.99999,  -0.99998999999710,  0.99998958118522,  -0.94450034224923] 
which  was  quite  different  from  the  "tnie”  reflection  coefflcients.  For  some  problems  (e.g., 

orthogonal  Alter  synthesis  [16]),  what  we  need  is  only  the  reflection  coefficients  rather  than  the 

factorization. 


2.4  Ill-conditioned  Matrices  can  have  small  Reflection  Coefficients. 


A  ill-conditioned  matrix  does  not  always  have  large  reflection  coefficients.  To  see  this  let 
us  deflne  the  two  matrices 


1  0.5  0.2' 

10  0' 

T  = 

0.5  1  0.5 

.  s  = 

0  1  0 

.0.2  0.5  1  . 

.10*  0  1. 

One  can  check  that 


A  mSTS^  = 


1  0.5  10*40.2 

0.5  1  5x10^40.5 

10*40.2  5x10^40.5  10000040001 


LL^, 


where 


(10) 


L 


1 

0.5 

10*40.2 


0 

VO?75 

0.4 


0 

0 

^  75 


The  condition  numbers  of  T  and  A  are 


ic(r)  =  4.7,  ic(A)=  1.5X  10“. 

The  matrix  A  also  has  displacement  rank  2  because  S  is  lower  triangular  Toe|ditz  (see  Chapter 
3  or  [12]).  Furthermore,  the  matrices  A  and  T  have  the  same  reflection  coefficients. 
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^  =  [0,1/2.-1/151.  (II) 

Let  Lj  and  denote  the  lower  triangular  factors  of  A  computed  with  double-precision  arith¬ 
metic  by  the  Schur  algorithm  and  the  Cholesky  algorithm,  respectively.  Our  simulation  experi¬ 
ment  gives  rather  surprising  result: 

IlL -Z;il  =  2.2x  ir“.  IlL -LJI  =  6.9x  Iff-’,  (12a) 

lU  -  L,L^\\  =  2.2  X  lOr**  lU  -  4^11  =  1-1  x  (12b) 

As  far  as  the  bounds  in  (12a)  are  concerned  the  Schur  algorithm  performs  far  better  than  the 

Cholesky  algorithm  (beware  of  the  matrix  products  575^,  however). 


2  J  Generalized  Schur  Algorithm  and  Sweet’s  Algorithm. 

Sweet’s  algorithm  [19]  behaves  quite  differently  from  the  Schur-type  fast  QR  factoriza¬ 
tion  algorithms.  Let  and  the  matrices  computed  by  Sweet’s  algorithm,  and  let  Q, 
and  be  the  matrices  computed  by  our  algorithm.  The  simulation  result  by  Luk  and  Qiao 
[IS]  for  the  following  matrix. 


T  =  — 
27 


shows  that 


27  9  3  -23+r 

9  27  9  3 

3  9  27  9 

-23+f  3  9  27 


t  -  i(r’ 


(13) 


.1  I  ( 


iia^,-7’iif/iirii  =  3xi(r“.  iiG/c2,-/ii/r/ii/iif  =  0.48. 

7116  inaccurate  Q,  might  be  understood  along  the  lines  that  (i)  T^T  is  ill-conditioned,  (ii)  the 

last  reflection  coefficient  is  very  close  to  one  and  (iii)  our  fast  QR  factorization  algorithm,  in 

fact,  computes  the  partial  triangulatization  (Chap  2)  of 

j*T  Y 

Y  /  • 
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Although  Sweet’s  algorithm  worics  well  for  the  ill-conditioned  matrix  (13),  Luk  and  Qiao 
have  shown  that  Sweet’s  algorithm  did  badly  for  the  following  well-conditioned  matrix 
(K(r)  =  5.6), 

8  4  2  1-r 

,  4  8  4  2 

^  =  ^  2  4  8  4 

1-r  2  4  8 

The  simulation  result  with  our  fast  algorithm  for  the  matrix  T  gives  a  very  accurate  result 
probably  because  T^T  does  not  have  large  reflection  coefflcients. 

2.6  Numerically  Stable  Fast  Algorithm  for  Indefirate  Matrices. 

For  indefinite  matrices,  the  leading  principal  submatrices  can  be  (close-to)  singular.  It  is 
easy  to  find  a  well-conditioned  indefinite  matrix  for  which  the  Schur  or  Levinson  algorithm 
perform  badly.  There  ate  some  previous  works  [5]  that  might  be  useful  for  finding  a  numeri¬ 
cally  stable  getKralized  Schur  algorithm. 

3.  Other  Open  Research  Problems 

3.1  Indefinite  Structured  Matrices. 

There  are  no  doubt  other  special  algorithms  that  can  be  put  into  array  form  in  the  way  we 
have  described.  In  particular,  we  might  mention  algoriduns  for  determining  the  root  distribu¬ 
tion  of  polynomials  by  studying  the  inertia  of  certain  related  matrices  called  Bezoutians  (see 
e.g.  [14]).  One  feature  of  such  matrices  is  that  they  may  not  be  strongly  regular  (strongly  non¬ 
singular)  in  the  sense  that  not  all  leading  minors  may  be  mmzero.  This  has  been  an  often  tacit 
assumption  in  most  of  this  thesis.  Solution  methods  are  known  for  some  of  these  problems, 
especially  in  the  Hankel  case,  and  it  would  be  interesting  to  examine  them  ftom  our  point  of 


view. 
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5.2  Array  form  of  the  RLS  Algorithms. 

Another  area  of  exploration  would  be  to  see  how  to  obtain  array  forms  for  the  various 
fast  lattice  and  transversal  filter  RLS  algorithms. 

5.5  Accelerating  the  convergence  of  the  Schur  Algorithm. 

The  Schur  algorithm  is  also  widely  used  for  the  spectral  factorization  (see  e.g.  [9]).  For 
such  applications,  it  would  be  useful  to  find  a  way  to  accelerate  the  convergence  of  the  algo¬ 
rithm. 

3.4  Doubly  Structured  Matrices. 

In  many  signal  processing  applications,  we  encounter  so-called  "doubly  structured" 
matrices  (e.g.  block-Toeplitz  matrix  with  Toeplitz  blocks).  All  known  fast  algorithms  do  not 
fully  utilize  this  additional  structure. 

5  J  Low  displacement  rank  Decompositions. 

Let  A  and  B  have  displacement  tanks  Oj  and  02.  respectively.  By  considering  the  matrix 

/  B 
A  O 

one  can  check  that  the  matrix  AB  has  displacement  rank  less  than  or  equal  to  at  -f  02  +  1. 
For  a  given  matrix  M  with  displacement  rank  a,  is  it  possiUe  to  fiixl  matrices  Mi  and  M2, 
whose  displacement  ranks  are  close  to  a/2,  such  that  M  =  MiMfl 


APPENDIX 

Given  a  sequence  of  reflection  coeflicients 
X  *  [ko,  k|,  ^2*  *  ’  •  ^1*  ^0  * 

we  can  generate  [7],  [11]  a  symmetric  positive-definite  Toeplitz  matrix  using  the  transmission 
line  shown  in  Fig  I.  We  excite  the  quiescent  transmission  line  0.e.  zero  initial  condition)  with 
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the  impulse  sequence 

[1.  0.  0.  ■  •  ]. 

Then  the  output  sequence  (see  Fig  1), 

[Cq,  Cl,  C2f  •  •  •  ].  Cq  =  1 

gives  the  desired  Toeplitz  matrix 
T  =  (c._,). 

Because  only  orthogonal  rotations  are  used,  this  procedure  is  numerically  stable  no  matter  how 
large  the  reflection  coefficients  are. 


C3  C2  Cl 


Fig  1.  Generating  Toeplitz  Matrix  given  Reflection  Coefficients. 
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